Reference To A Related Application

The present application is a continuation-in-part of our copending U.S. Patent Application Serial No.
07/866,045, filed on April 9, 1992, which is incorporated by reference in its entirety.


Background of the Invention 

The present invention concerns non-A, non-B hepatitis (hereinafter called NANB hepatitis) virus 
genome, polynucleotides, polypeptides, related antigen, antibody and detection systems for detecting 
NANB antigens or antibodies.

Hepatitis A and hepatitis B are the forms of viral hepatitis for which the DNA or RNA of the causative
virus has been elucidated and for which diagnosis, and in some cases prevention, has been established. The
general name NANB hepatitis was given to the other forms of viral hepatitis.

Post-transfusion hepatitis was remarkably reduced after introduction of diagnostic systems for screening
hepatitis B in blood for transfusion. However, there are still an estimated 280,000 annual cases of
post-transfusion hepatitis caused by NANB hepatitis in Japan.

NANB hepatitis viruses were recently named C, D and E according to their types, and scientists started
a worldwide effort to identify the causative viruses and subsequently eradicate them.

In 1988, Chiron Corp. claimed that they had succeeded in cloning an RNA virus genome, which they
termed hepatitis C virus (hereinafter called HCV), as the causative agent of NANB hepatitis and reported on
its nucleotide sequence (British Patent 2,212,511 which is the equivalent of European Patent Application
0,318,216). HCV (C100-3) antibody detection systems based on the sequence are now being introduced for
screening of blood for transfusion and for diagnosis of patients in Japan and in many other countries. The
detection systems for the C100-3 antibody have proven their partial association with NANB hepatitis;
however, they capture only about 70% of carriers and chronic hepatitis patients, and they fail to detect the
antibody in acute phase infection, thus leaving problems yet to be solved even after development of the
C100-3 antibody detection systems by Chiron Corp.

The course of NANB hepatitis is troublesome, and most patients are considered to become carriers and
then to develop chronic hepatitis. In addition, most patients with chronic hepatitis develop liver cirrhosis,
then hepatocellular carcinoma. It is therefore imperative to isolate the virus itself and to develop
effective diagnostic reagents enabling earlier diagnosis.

The presence of a number of NANB hepatitis cases which cannot be diagnosed by Chiron's C100-3 antibody
detection kits suggests a possible difference in subtype between Chiron's HCV and Japanese NANB
hepatitis virus.

In order to develop more specific NANB hepatitis diagnostic kits and effective vaccines,
it is essential to analyze each subtype of the NANB hepatitis causative virus at its
genetic and corresponding amino acid level.

Summary of the Invention

An object of the present invention is to provide the nucleotide sequence coding for the structural protein
of NANB hepatitis virus and, with such information, to analyze amino acids of the protein to locate and
provide polypeptides useful as antigen for establishment of detection systems for NANB virus, its related
antigens and antibodies.

A further object of the present invention is to locate polynucleotides essential to treatment, prevention
and diagnosis, and polypeptides effective as antigens, by isolating NANB hepatitis virus RNA from human
and chimpanzee virus carriers, cloning the cDNA covering the whole structural gene of the virus to
determine its nucleotide sequence, and studying the amino acid sequence encoded by the cDNA. As a result, the
inventors have determined the nucleotides of the whole genome of a strain of NANB virus called HC-J6 and
a strain called HC-J8. The NANB hepatitis virus genomes of HC-J6 and HC-J8 differ from that of Chiron's HCV.

Brief Description of the Drawings 

Figure 1 shows the restriction map and structure of the coding region of NANB hepatitis virus genome
(HC-J6) and positions of clones. C, E, NS-1, NS-2, NS-3, NS-4 and NS-5 are the abbreviations of core,
envelope, non-structural-1, -2, -3, -4 and -5.

Figures 2 to 4 show the method of determination of the nucleotide sequence of the 5' terminus of NANB
hepatitis virus genome of strains HC-J1, HC-J4 and HC-J6, respectively.

Figure 5 shows the method of determination of the nucleotide sequence of the 3' terminus of HC-J6
genome. Solid lines show nucleotide sequences determined by clones from libraries of bacteriophage
lambda gt10, and broken lines show nucleotide sequences determined by clones obtained by PCR.

Figure 6 shows the structure of the coding region of NANB hepatitis virus genome (HC-J8) and positions of
clones. Regions a to n indicate positions of amplification by PCR.

Detailed Description of the Invention

The present invention provides NANB hepatitis virus genome RNA for strain HC-J6 (sequence list 1)
consisting of a noncoding region of 340 nucleotides on the 5' terminus, followed by an open reading frame
consisting of 9099 nucleotides coding for the structural protein and non-structural protein, followed by a
noncoding region consisting of 150 nucleotides containing a U-stretch consisting of 108 uracils on the 3'
terminus of NANB hepatitis virus, and NANB hepatitis virus genome having substantially the nucleotide
sequence of sequence list 1.

The present invention provides polynucleotide N-9589 (strain HC-J6) comprising the DNA nucleotide
sequence of sequence list 2; cDNA clone J6-081 comprising the nucleotide sequence of sequence list 3;
cDNA clone J6-08 comprising the nucleotide sequence of sequence list 4; and NANB hepatitis virus
polynucleotides having substantially the sequence of nucleotides of NANB hepatitis virus nucleotides shown
above.

The invention provides polypeptide coded for by genome or polynucleotide of HC-J6 above, polypeptide
P-J6-3033 comprising the polypeptide sequence of sequence list 5, polypeptides produced by using
recombinant genome, recombinant polynucleotides and recombinant cDNA of whole or a part of cDNA
above, and polyclonal or monoclonal antibodies against the polypeptides described above.

The present invention also provides NANB hepatitis virus genome for strain HC-J8 comprising
sequence list 6, NANB hepatitis virus RNA consisting of a noncoding region consisting of 341 nucleotides on
the 5' terminus followed by an open reading frame consisting of 9099 nucleotides coding for the structural
protein and non-structural protein followed by a noncoding region consisting of 71 nucleotides containing a
U-stretch consisting of 30 uracils on the 3' terminus of NANB hepatitis virus comprising sequence list 6, and
NANB hepatitis virus genome having substantially the nucleotide sequence of sequence list 6.
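The region lengths stated above for the two strains can be cross-checked with a short calculation. The following is a minimal sketch, not part of the disclosed method; the class and attribute names are hypothetical, and it assumes the stated open reading frame length counts sense codons only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomeLayout:
    """Region lengths in nucleotides, as stated in the description."""
    five_prime_noncoding: int
    open_reading_frame: int
    three_prime_noncoding: int

    @property
    def total_length(self) -> int:
        # Sum of the three consecutive regions of the genome RNA.
        return (self.five_prime_noncoding
                + self.open_reading_frame
                + self.three_prime_noncoding)

    @property
    def encoded_residues(self) -> int:
        # Assumption: the ORF length covers sense codons only.
        if self.open_reading_frame % 3:
            raise ValueError("ORF length is not a multiple of 3")
        return self.open_reading_frame // 3


HC_J6 = GenomeLayout(340, 9099, 150)
HC_J8 = GenomeLayout(341, 9099, 71)

print(HC_J6.total_length, HC_J8.total_length, HC_J6.encoded_residues)
```

Under these assumptions the totals agree with the designations N-9589 and N-9511, and the residue count agrees with the 3033-residue polypeptides P-J6-3033 and P-J8-3033.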

The present invention provides polynucleotide N-9511 for strain HC-J8 comprising the DNA nucleotide
sequence of sequence list 7 and NANB hepatitis virus polynucleotide having substantially the sequence of
nucleotides of NANB hepatitis virus nucleotides comprising sequence list 7.

The invention provides polypeptide coded for by genome or polynucleotide of HC-J8 above, polypeptide
P-J8-3033 comprising the polypeptide sequence of sequence list 8 and polypeptide P-J8-3033-2
comprising the polypeptide sequence of sequence list 9, polypeptides produced by using recombinant
genome, recombinant polynucleotides and recombinant cDNA of whole or a part of cDNA above, and
polyclonal or monoclonal antibodies against the polypeptides described above.

The present invention, furthermore, provides NANB hepatitis diagnostic system using polypeptides or
antibodies described above.

In the method described below, NANB hepatitis virus RNA of the present invention was obtained and its
nucleotide sequence was determined.

Plasma samples (HC-J1, HC-J4, HC-J6 and HC-J8) were obtained from human and chimpanzee.
HC-J1, HC-J6 and HC-J8 were obtained from Japanese blood donors who had tested positive for HCV
antibody. HC-J4 was obtained from the chimpanzee subjected to the challenge test but was negative for
Chiron's C100-3 antibody previously mentioned.

RNA was isolated from each of the plasma samples. Following the study of the 5' terminus of approximately
2,500 nucleotides and the 3' terminus of approximately 1,100 nucleotides disclosed in Japanese patent
application No. 196175/91, the inventors have completed the study of the region coding for non-structural
protein of strain HC-J6 and the study of the full length sequence of 9,589 nucleotides of HC-J6 genome
RNA, and have completed the study of the region coding for non-structural protein of strain HC-J8 and the
study of the full length sequence of 9,511 nucleotides of HC-J8 genome RNA.

As described in the Examples below, strain HC-J6 had a 5' noncoding region consisting of 340
nucleotides, and strain HC-J8 had a 5' noncoding region consisting of 341 nucleotides, followed by a region
coding for structural protein and a region coding for non-structural protein.

Concerning the 3' terminus, strain HC-J6 was found to have a region consisting of 150 nucleotides
containing a U-stretch consisting of 108 uracils following the region coding for non-structural protein,
and strain HC-J8 was found to have a region consisting of 71 nucleotides containing a U-stretch consisting
of 30 uracils following the region coding for non-structural protein. [...] HC-J6 and

[...] as 76.9%. On the other [...] approximately 3,000 nucleotides of 5' terminus.

[...] region (1050 [...] amino acids of [...]) called hypervariable [...]

[...] produce peptides of the invention by integration into a host genome, e.g. E. coli or Bacillus, by means
of known genetic engineering techniques.

Polyclonal and monoclonal antibodies of the invention are useful as materials for diagnostic agents to
detect NANB hepatitis [...].

[...] with partial replacement of amino acids, and a detection system using monoclonal or polyclonal antibodies to such
polypeptides, are useful as diagnostic agents of NANB hepatitis [...] to
screen out NANB hepatitis virus from blood for transfusion or blood derivatives. The polypeptides, or
antibodies to such polypeptides, can be used as a material for a vaccine against NANB hepatitis virus.

It is known in the art that one or more nucleotides in a DNA sequence can be replaced by other
nucleotides in order to produce the same protein. The present invention also concerns such nucleotide
substitutions which yield DNA sequences which code for polypeptides as described above. It is also well
known in the art that one or more amino acids in an amino acid sequence can be replaced by equivalent
other amino acids, as demonstrated by U.S. Patent No. 4,737,487 which is incorporated by reference, in
order to produce an analog of the amino acid sequence. Any analogs of the polypeptides of the present
invention involving amino acid deletions, amino acid replacements, such as replacements by other amino
acids or by isosteres (modified amino acids that bear close structural and spatial similarity to protein amino
acids), amino acid additions, or isostere additions can be utilized, so long as the sequences elicit [...].

Examples are shown below; however, the invention shall in no way be
limited to those examples.
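The statement above that nucleotides can be replaced without changing the encoded protein can be illustrated with the standard genetic code. This is a minimal sketch: the `original` string is the DNA equivalent of the first five codons of the HC-J6 open reading frame in Sequence ID No. 1, while the `variant` string is a hypothetical substitution invented for illustration and is not a sequence from the specification.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


original = "ATGAGCACAAATCCT"  # nt341-355 of Sequence ID No. 1, written as DNA
variant = "ATGTCTACGAACCCA"   # hypothetical synonymous substitutions

print(translate(original), translate(variant))
```

Both strings translate to the same five residues although they differ at six nucleotide positions, which is the kind of equivalence the paragraph refers to.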
Examples 

The 5' terminal nucleotide sequence and amino acid sequence of NANB hepatitis virus genome were
determined in the following way:

(1) Isolation of RNA 

RNA of the samples (HC-J1, HC-J6, HC-J8) from plasma of Japanese blood donors testing positive for
HCV antibody (by Ortho HCV Ab ELISA, Ortho Diagnostic System, Tokyo), and that of the sample
(HC-J4) from the chimpanzee challenged with NANB hepatitis for infectivity and negative for HCV antibody
[...].

[...] Tris chloride buffer (10 mM, pH 8.0) and centrifuged at 68 x [...]
rpm [...]. The precipitate was suspended in Tris chloride buffer (50 mM, pH 8.0) containing [...]
NaCl, 10 mM EDTA, 2% (w/v) sodium dodecyl sulfate (SDS), and proteinase K 1 mg/ml, and incubated at 60°C
for 1 hour. Then their nucleic acids were extracted by phenol/chloroform and precipitated by ethanol to
obtain RNA.

(2) HC-J1 and HC-J8 cDNA Synthesis 

After heating the RNA isolated from HC-J1 or HC-J8 plasma at 70°C for 1 minute, this was used as a
template; 10 units of reverse transcriptase (cDNA Synthesis System Plus, Amersham Japan) and 20 pmol
of oligonucleotide primer (20-mer) were added and incubated at 42°C for 1.5 hours to obtain cDNA. Primer
#8 (5'-GATGCTTGCGGAAGCAATCA-3') was prepared by referring to the basic sequence shown in
European Patent Application No. 88310922.5, which is relied on and incorporated herein by reference.

(3) cDNA Was Amplified by the Following Polymerase Chain Reaction (PCR)

cDNA was amplified for 35 cycles according to Saiki's method (Science (1988) 239: 487-491) using
Gene Amp DNA Amplifier Reagent (Perkin-Elmer Cetus) on a DNA Thermal Cycler (Perkin-Elmer Cetus).
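As a rough guide to what a given number of PCR cycles can achieve, the theoretical yield can be estimated from the cycle count. The sketch below assumes an idealised model in which every cycle multiplies the template by (1 + efficiency); the function name and the efficiency parameter are illustrative assumptions and are not taken from the cited method.

```python
def theoretical_fold_amplification(cycles: int, efficiency: float = 1.0) -> float:
    """Upper-bound fold amplification after `cycles` PCR cycles.

    efficiency = 1.0 models perfect doubling per cycle; real reactions
    plateau well below this bound.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return (1.0 + efficiency) ** cycles


print(f"{theoretical_fold_amplification(35):.2e}")       # 35 cycles, ideal doubling
print(f"{theoretical_fold_amplification(35, 0.8):.2e}")  # 35 cycles, 80% efficiency
```

The same function applies to the two-stage amplifications described later in the Examples (35 cycles followed by 30 cycles), where the second stage starts from the product of the first.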

For cDNA synthesis and for PCR for HC-J8, synthesized primers disclosed in Japanese patent
application 153402/90 and those based on HC-J1, HC-J4 and HC-J6 genomes disclosed in Japanese patent
applications 196175/91 and below were utilized.

(4) Determination of 5' Terminal Nucleotide Sequence of HC-J1 and HC-J4 by Assembling cDNA Clones

As shown in Figures 2 and 3, nucleotide sequences of 5' termini of the genomes of strains HC-J1 and
HC-J4 were determined by combined analysis of clones obtained from the cDNA library constructed in
bacteriophage λgt10 and clones obtained by amplification of HCV specific cDNA by PCR.

Figures 2 and 3 show 5' termini of NANB hepatitis virus genome together with cleavage sites by
restriction endonuclease and sequences of primers used. In the figures, solid lines are nucleotide sequences
determined by clones from the bacteriophage λgt10 library while dotted lines show sequences determined by
clones obtained by PCR.

The nucleotide sequence of HC-J1 spanning nt454-2109 was determined by clone 041 which was
obtained by inserting the cDNA synthesized with the primer #8 into λgt10 phage vector (Amersham).

Another primer [...] (5'-TCCCTGTTGCATAGTTCACG-3') corresponding to nt824-843 was synthesized
based on [...] sequence, and four clones (060, 061, 066 and 075) were obtained to cover the upstream
sequence nt18-843.

(5) Determination of 5' Terminal Nucleotide Sequence of HC-J6.

The nucleotide sequence of the 5' terminus of strain HC-J6 was determined from analysis of clones
[...].

[...] was made in the same manner as
described in (1) above. Sequences in the range of nt24-2551 of the RNA were determined from the consensus
sequence of respective clones obtained by amplification by PCR using each pair of primers based on
nucleotide sequence of HC-J4.

nt24-826
#32 (5'-ACTCCACCATAGATCACTCC-3')
#122 (5'-AGGTTCCCTGTTGCATAATT-3')
Clones: C9397, C9388, C9764

nt732-1907
#50 (5'-GCCGACCTCATGGGGTACAT-3')
#128 (5'-TCGGTCGTGCCCACTACCAC-3')
Clones: C9316, C9752, C9753

nt1847-2571
#149 (5'-TCTGTGTGTGGCCCAGTGTA-3')
#146 (5'-AGTAGCATCATCCACAAGCA-3')
Clones: C11621, C11624, C11655

In order to determine further upstream of the 5' terminus, cDNA was synthesized using antisense primer
#36 (5'-AACACTACTCGGCTAGCAGT-3') corresponding to nt246-265, dAs were then added to the 3' terminus
of the cDNA using terminal deoxynucleotidyl transferase, and one-sided PCR amplification was made twice
as described below.

cDNA was amplified for 35 cycles as first stage PCR using oligo dT primer (20-mer) and antisense
primer #48 (5'-GTTGATCCAAGAAAGGACCC-3') of nt188-207, followed by the second stage of PCR by 30
cycle amplification using the first PCR product as a template, oligo dT primer (20-mer) and antisense
primer #109 (21-mer; 5'-ACCGGATCCGCAGACCACTAT-3') corresponding to nt140 to 160. The obtained
PCR product was subcloned to M13 phage vector.
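An antisense primer anneals to the genome strand through its reverse complement, so its binding site can be located by scanning the template for the best ungapped match. The sketch below is illustrative only: the helper names are hypothetical, the primer is #36 as spelled out in section (9), and the template is the nt241-300 excerpt of the HC-J6 genome from Sequence ID No. 1 converted to DNA letters.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def best_site(template: str, probe: str) -> tuple[int, int]:
    """Return (1-based offset, mismatch count) of the best ungapped match."""
    best = None
    for start in range(len(template) - len(probe) + 1):
        window = template[start:start + len(probe)]
        mismatches = sum(a != b for a, b in zip(window, probe))
        if best is None or mismatches < best[1]:
            best = (start + 1, mismatches)
    if best is None:
        raise ValueError("probe is longer than template")
    return best


EXCERPT_FIRST_NT = 241  # position of the first nucleotide of the excerpt
excerpt_rna = "CAAGACUGCUAGCCGAGUAGCGUUGGGUUGCGAAAGGCCUUGUGGUACUGCCUGAUAGGG"
excerpt_dna = excerpt_rna.replace("U", "T")
primer_36 = "AACACTACTCGGCTAGCAGT"

offset, mismatches = best_site(excerpt_dna, reverse_complement(primer_36))
site_start = EXCERPT_FIRST_NT + offset - 1
site_end = site_start + len(primer_36) - 1
print(site_start, site_end, mismatches)
```

On this excerpt the best match spans nt245-264 of the HC-J6 numbering with a single mismatch, one position off the stated nt246-265. A plausible but unverified explanation is that the primer coordinates follow the numbering of the strain on which the primer was designed.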

Nucleotide sequence from nt1 to 23 was determined from consensus sequence of 13 isolated clones
C9577, C9579, C8681, C9587, C9590, C9591, C9595, C9606, C9609, C9615, C9616 and C9619 obtained
above, which were considered to have the complete 5' terminus.

(6) Determination of nucleotide sequence of HC-J6 middle region.

A cDNA library was constructed using λgt10 according to the method described in (2) above from
100 ml of HC-J6 plasma as a starting material. Primers #162 and #81 [...]
referring to the basic sequence shown in European Patent Application Publication No. 318,216. Clones
were selected by plaque hybridization.

Nucleotide sequence from nt2552 to 8700 was determined from the consensus sequence of four obtained
cDNA clones 02 (nt6996 to 8700), 06 (nt6485 to 8700), 08 (nt6008 to 8700) and 081 (nt2199 to 6168) as
shown in Figure 1. Clones 081 and 08 were found to have nucleotide sequences shown in sequence lists 3
and 4 respectively.

(7) Determination of 3' terminal nucleotide sequence of HC-J6 strain. 

As shown in Figure 5, the nucleotide sequence of the 3' terminus of HC-J6 genome was determined by
analysis of clones obtained by amplification of HCV specific cDNA by PCR.

Nucleotide sequence of HC-J6 from nt8701 to 9241 was determined from consensus sequence of three
clones consisting of 938 nucleotides, C9760, C9234 and C9761, obtained by amplification of sample using
primer #80 (5'-GACACCCGCTGTTTTGACTC-3') and #60 (5'-GTTCTTACTGCCCAGTTGAA-3').

Nucleotide sequence of the 3' terminus downstream from nt9242 was determined by the method described
below.

Isolation of RNA from HC-J6 was made in the same manner as described in (1) above. Poly(A) was added
to the 3' terminus of the obtained RNA using poly(A) polymerase, cDNA was synthesized using
oligo(dT)20 as a primer, and the obtained cDNA was used as a template for PCR.

The first PCR product was made using #97 (5'-AGTCAGGGCGTCCCTCATCT-3') as a sense primer
and oligo(dT)20 as an antisense primer. The second PCR product was made using #90
(5'-GCCGTTTGCGGCCGATATCT-3'), corresponding to a sequence downstream of #97, as a sense primer, and
oligo(dT)20 as an antisense primer, with the first PCR product as a template. The PCR product obtained by
two-step amplification was blunted on both ends by treatment with [...] and
phosphorylation of the 5' terminus by T4 polynucleotide kinase. The obtained product was subcloned into the Hinc II
site of M13mp19 phage vector.

Nucleotide sequence of the 3' terminus was determined from [...]
C10311, C10313, C10314, C10320, C10322, C10323, C10326, C10328, C10330, C10333, C10334, C10336,
C10337, C10345, C10346, C10347, C10349, C10350 and C10357.

As a result, the nucleotide sequence of cDNA to HC-J6 genome RNA was determined as shown in
sequence list 2, and the full sequence of genome RNA was determined as shown in sequence list 1.

(8) Determination of amino acid sequences.

According to the nucleotide sequence of the genome of strain HC-J6, the sequence of the coding
region starting with ATG was determined. As a result, HC-J6 genome was found to have a long open
reading frame coding for a polypeptide precursor consisting of 3033 amino acid residues.
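The deduction of an amino acid sequence from the coding region can be sketched with the standard genetic code. The example below is a simplified illustration with hypothetical helper names: it translates the nt331-370 excerpt of Sequence ID No. 1, chosen so that the first AUG in the excerpt is the start codon at nt341. A search for the first AUG would not be valid on the full genome, which contains AUG triplets upstream of the open reading frame.

```python
from itertools import product

# Standard genetic code for RNA, codons enumerated in UCAG order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_from_start(rna: str) -> str:
    """Translate from the first AUG until a stop codon or the end of input."""
    start = rna.find("AUG")
    if start < 0:
        raise ValueError("no start codon found")
    residues = []
    for i in range(start, len(rna) - 2, 3):
        aa = CODON_TABLE[rna[i:i + 3]]
        if aa == "*":  # stop codon
            break
        residues.append(aa)
    return "".join(residues)


EXCERPT_FIRST_NT = 331
excerpt = "ACCGUGCACCAUGAGCACAAAUCCUAAACCUCAAAGAAAA"  # nt331-370, HC-J6

start_nt = EXCERPT_FIRST_NT + excerpt.find("AUG")
print(start_nt, translate_from_start(excerpt))
```

The start position agrees with the 340-nucleotide 5' noncoding region, and the 9099-nucleotide reading frame corresponds to 9099 / 3 = 3033 codons, matching the stated length of the polypeptide precursor.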

(9) Determination of 5' terminal nucleotide sequence of HC-J8

As shown in Figure 6, the nucleotide sequence of the 5' terminus of HC-J8 genome (a region) was
determined by analysis of clones obtained by amplification of HCV specific cDNA by PCR.

Single-stranded cDNA was synthesized using antisense primer #36 (5'-AACACTACTCGGCTAGCAGT-3')
of nt246 to 265 in the same manner as (2) above, then a dATP tail was added at its 3' terminus by
terminal deoxynucleotidyl transferase, and it was then amplified by one-sided PCR in two stages.

That is, in the first stage, antisense primer #48 (5'-GTTGATCCAAGAAAGGACCC-3') of nt188 to 207
was used with a sense primer selected from non-specific primers #165
(5'-AAGGATCCGTCGACATCGATAATACG(A)17-3') and #171 (5'-AAGGATCCGTCGACATCGATAATACG(T)17-3')
to amplify the dA-tailed cDNA by PCR for 35 cycles; and in the second stage, using the product of the
first-stage PCR as a template, non-specific primer #166 (5'-AAGGATCCGTCGACATCGAT-3') and antisense
primer #109 (21-mer; 5'-ACCGGATCCGCAGACCACTAT-3') were added to initiate PCR for 30 cycles. The
product of PCR was subcloned [...].

Thirteen independent clones (poly dT-tailed: C14951, C14952, C14953, C14958, C14960, C14968,
C14971, C14972 and C14974; poly dA-tailed: C14987, C14996, C14999 and C15000) were obtained (each
considered to have the complete length of the 5' terminus), and the consensus sequence of nt1-139 of the
respective clones was determined.

(10) cDNA amplification of ORF region and 3' terminus by PCR

As shown in Figure 6, the nucleotide sequence downstream from nt140 of HC-J8 genome was
determined by analysis of clones obtained by amplification of HCV specific cDNA by PCR.

Single-stranded cDNAs to HC-J8 RNA were synthesized in the same manner as (2) above using
antisense primers described below, then they were amplified by PCR using sense and antisense primers
described below. Each product of PCR was subcloned to M13 phage vector, then the consensus sequence of
the respective clones of each region was determined.
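A consensus sequence over several clones of the same region can be derived by majority vote at each position. The sketch below assumes the clone reads are already aligned and of equal length; the function name and the three example reads are hypothetical and do not reproduce data from the specification.

```python
from collections import Counter


def consensus(aligned_reads: list[str]) -> str:
    """Majority-vote consensus of equal-length, pre-aligned reads.

    Ties are resolved in favour of the base seen first at that position,
    which is an arbitrary choice made for this sketch.
    """
    if not aligned_reads:
        raise ValueError("at least one read is required")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")
    return "".join(
        Counter(column).most_common(1)[0][0]
        for column in zip(*aligned_reads)
    )


# Three hypothetical clone reads; the third carries one deviating base.
reads = ["ACGUAC", "ACGUAC", "ACCUAC"]
print(consensus(reads))
```

With three clones per region, as listed for most regions below, a base that deviates in a single clone is outvoted by the other two.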

The primers for cDNA synthesis and PCR amplification, and the numbers of obtained clones are shown
below for each region. Alphabetical symbol of each amplified region corresponds to that in Figure 6.

b region
nt45-847
Primer for cDNA synthesis: #122 (5'-AGGTTCCCTGTTGCATAATT-3')
Primer for PCR: sense: #32A (5'-CTGTGAGGAACTACTGTCTT-3')
                antisense: #122
Clones: C15221, C15222, C15223

c region
nt732-1354
Primer for cDNA synthesis: #54 (5'-ATCGCGTACGCCAGGATCAT-3')
Primer for PCR: sense: #50 (5'-GCCGATCTCATGGGGTACAT-3')
                antisense: #54
Clones: C15256, C15257, C15258

d region
nt1300-1879
Primer for cDNA synthesis: #199 (5'-GGGGTGAAACAATACACCGG-3')
Primer for PCR: sense: #205 (5'-GGGACATGATGATCAACTGG-3')
                antisense: #199
Clones: C14221, C14222, C14223

e region
nt1833-2518
Primer for cDNA synthesis: #146 (5'-AGTAGCATCATCCACAAGCA-3')
Primer for PCR: sense: #150 (5'-ATCGTCTCGGCTAAGACGGT-3')
                antisense: #146
Clones: C11535, C11540, C11566

f region
nt2433-3451
Primer for cDNA synthesis: #170 (5'-GCATAAGCAGTGATGGGGGC-3')
Primer for PCR: sense: #160 (5'-CAGAACATCGTGGACGTGCA-3')
                antisense: #170
Clones: C15348, C15349, C15356

g region
nt3404-4300
Primer for cDNA synthesis: #225 (5'-TCGCATATGATGATGTCATA-3')
Primer for PCR: sense: #238 (5'-CTACACCTCCAAGGGGTGGA-3')
                antisense: #225
Clones: C15701, C15702, C15703

h region
nt4221-5015
Primer for cDNA synthesis: #216 (5'-GTGGTCTAGACATACGGGCA-3')
Primer for PCR: sense: #230 (5'-CCCATCACGTACTCCACATA-3')
                antisense: #216
Clones: C15391, C15392, C15393

i region
nt4695-5062
Primer for cDNA synthesis: #210 (5'-GCATCTATGTGTGTGAGGCC-3')
Primer for PCR: sense: #209 (5'-TTCGACTCCGTGATCGACTG-3')
                antisense: #210
Clones: C14087, C14088, C14089

j region
nt5021-6169
Primer for cDNA synthesis: #162 (5'-TCCGACTCCGTCACGTAGTG-3')
Primer for PCR: sense: #227 (5'-GTTCTGGGAAGCGGTCTTTA-3')
                antisense: #162
Clones: C15421, C15422, C15423

k region
nt6027-6889
Primer for cDNA synthesis: #232 (5'-GATGGGTCTGTTAGCATGGA-3')
Primer for PCR: sense: #242 (5'-TTGGTAGTGGGAGTCATCTG-3')
                antisense: #232
Clones: C15733, C15734, C15735

l region
nt6834-7735
Primer for cDNA synthesis: #239 (5'-ATCGGTAACTTCTCCTCTTC-3')
Primer for PCR: sense: #241 (5'-CCTTGCGATCCTGAACCTGA-3')
                antisense: #239
Clones: C15798, C15799, C15800

m region
nt7656-8630
Primer for cDNA synthesis: #222 (5'-GACCAGGTCGTCTCCACACA-3')
Primer for PCR: sense: #229 (5'-GTCGTGTGCTGCTCCATGTC-3')
                antisense: #222
Clones: C15376, C15378, C15381

n region
nt8325-9511
Primer for cDNA synthesis: #165
Primer for PCR: sense: #80 (5'-GACACCCGCTGTTTTGACTC-3')
                non-specific: #165
Clones: C15270, C15271, C15272
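Because the full-length sequence is assembled from overlapping amplified regions, it is useful to confirm that the regions leave no gap. The following sketch uses the nucleotide ranges listed above for regions b to n, with region a taken as nt1-139 from section (9); the function name is hypothetical and the check is illustrative, not part of the disclosed procedure.

```python
def uncovered_gaps(regions, first, last):
    """Return the (start, end) intervals of first..last not covered by any region."""
    gaps = []
    next_needed = first
    for start, end in sorted(regions):
        if start > next_needed:
            gaps.append((next_needed, min(start - 1, last)))
        next_needed = max(next_needed, end + 1)
    if next_needed <= last:
        gaps.append((next_needed, last))
    return gaps


# Inclusive nucleotide ranges of the HC-J8 regions a to n.
HC_J8_REGIONS = {
    "a": (1, 139), "b": (45, 847), "c": (732, 1354), "d": (1300, 1879),
    "e": (1833, 2518), "f": (2433, 3451), "g": (3404, 4300),
    "h": (4221, 5015), "i": (4695, 5062), "j": (5021, 6169),
    "k": (6027, 6889), "l": (6834, 7735), "m": (7656, 8630),
    "n": (8325, 9511),
}

print(uncovered_gaps(HC_J8_REGIONS.values(), 1, 9511))
```

An empty result means that every position from nt1 to nt9511 falls inside at least one amplified region, so neighbouring consensus sequences can be joined through their overlaps.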



From the analysis described above, full nucleotide sequence of cDNA to HC-J8 was determined as 
shown in sequence list 7, then full nucleotide sequence of HC-J8 genome RNA as shown in sequence list 6. 
Two amino acid sequences shown in sequence lists 8 and 9 represent those coded for by HC-J8 genome. 

Utilizing known immunological techniques, it is possible to determine epitopes (e.g., from the core
region, etc.) from the polypeptides of sequence lists 5, 8 and 9. Determination of such epitopes of the
NANB hepatitis virus opens access to chemical synthesis of the peptide, manufacturing of the peptide by
genetic engineering techniques, synthesis of the polynucleotides, manufacturing of the antibody,
manufacturing of NANB hepatitis diagnostic reagents, and development of products such as NANB hepatitis
vaccines.

According to the well-known method described by Merrifield, NANB peptides can be synthesized.
Furthermore, the polynucleotides in sequence lists 2-4 and 7 can be used to express polypeptides in host
cells such as Escherichia coli by means of genetic engineering techniques.

A detection system for antibody against NANB hepatitis virus can be developed using polyvinyl
microtiter plates and the sandwich method. For example, 50 µl of a NANB peptide at a concentration of 5 µg/ml
can be dispensed in each well of the microtiter plates and incubated overnight at room temperature for
consolidation. The microplate wells can be washed five times with physiological saline containing 0.05%
Tween 20. For overcoating, 100 µl of NaCl buffer containing 30% (v/v) of calf serum and 0.05% Tween 20
(CS buffer) can be dispensed in each well and discarded after incubation for 30 minutes at room
temperature.
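The coating step above fixes both a concentration and a volume per well, from which the amount of peptide per well and per plate follows directly. The sketch below is simple arithmetic under stated assumptions: the function names are hypothetical, and the 96-well plate format and the pipetting overage are assumptions that do not appear in the text.

```python
def peptide_per_well_ug(concentration_ug_per_ml: float, volume_ul: float) -> float:
    """Mass of peptide dispensed into one well, in micrograms."""
    return concentration_ug_per_ml * volume_ul / 1000.0


def coating_solution_ml(wells: int, volume_ul: float, overage: float = 0.1) -> float:
    """Volume of coating solution to prepare, with a fractional pipetting overage."""
    return wells * volume_ul * (1.0 + overage) / 1000.0


per_well = peptide_per_well_ug(5.0, 50.0)       # 5 ug/ml, 50 ul per well
plate_volume = coating_solution_ml(96, 50.0)    # assumed 96-well plate
print(per_well, plate_volume, per_well * 96)
```

At 5 µg/ml and 50 µl per well, each well receives 0.25 µg of peptide.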

For determination of NANB antibodies in samples, in the primary reaction, 50 µl of the CS buffer
containing 30% calf serum and 10 µl of a sample can be dispensed in each microplate well and incubated
on a microplate vibrator for one hour at room temperature. After completion of the reaction, microplate wells
can be washed five times in the same way as previously described.

In the secondary reaction, as labeled antibody, 1 ng of horseradish peroxidase labeled anti-human IgG
mouse monoclonal antibodies (Fab' fragment: 22G, Institute of Immunology Co., Ltd., Tokyo, Japan)
dissolved in 50 µl of calf serum can be dispensed in each microplate well, and incubated on a microplate
vibrator for one hour at room temperature. Wells can be washed five times in the same way. After addition
of hydrogen peroxide (as substrate) and 50 µl of o-phenylenediamine solution (as color developer) in each
well, and after incubation for 30 minutes at room temperature, 50 µl of 4M sulphuric acid can be dispensed
in each well to stop further color development, and absorbance can be read at 492 nm.

The cut-off level of this assay system can be set by measuring a number of donor samples with a normal
serum ALT (alanine aminotransferase) value of 34 Karmen units or below and which tested negative for
anti-HCV.
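The text specifies which donor samples form the reference panel but not how the cut-off is computed from them. The sketch below selects the panel by the stated criteria and then applies a mean plus k standard deviations rule. That rule, the value k = 3, the class and function names, and the example absorbances are all assumptions made for illustration and are not taken from the specification.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class DonorSample:
    absorbance_492: float
    alt_karmen_units: float
    anti_hcv_positive: bool


def reference_panel(samples, alt_limit=34.0):
    """Absorbances of donors with normal ALT who tested negative for anti-HCV."""
    return [
        s.absorbance_492
        for s in samples
        if s.alt_karmen_units <= alt_limit and not s.anti_hcv_positive
    ]


def cutoff(reference_absorbances, k=3.0):
    """Assumed rule: mean of the reference panel plus k sample standard deviations."""
    return mean(reference_absorbances) + k * stdev(reference_absorbances)


# Hypothetical measurements, for illustration only.
donors = [
    DonorSample(0.05, 20, False), DonorSample(0.07, 25, False),
    DonorSample(0.06, 30, False), DonorSample(0.08, 34, False),
    DonorSample(0.04, 18, False),
    DonorSample(0.30, 60, False),  # excluded: ALT above 34 Karmen units
    DonorSample(0.90, 22, True),   # excluded: anti-HCV positive
]
panel = reference_panel(donors)
level = cutoff(panel)
print(panel, round(level, 4))
```

A sample would then be scored reactive when its absorbance at 492 nm exceeds the computed level; other conventions for the cut-off are equally possible.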

The present invention makes possible detection of NANB hepatitis virus infection which could not be
detected by conventional determination methods, and provides NANB hepatitis detection kits capable of
highly specific and sensitive detection at an early phase of infection.
These features allow accurate diagnosis of patients at an early stage of the disease and also help to
remove NANB hepatitis virus carrier blood at a higher rate through screening tests of donor blood.

Polypeptides and their antibodies under this invention can be utilized for manufacture of vaccines and
immunological pharmaceuticals, and structural gene of NANB hepatitis virus provides indispensable tools
for detection of polypeptide antigens and antibodies.
Antigen-antibody complexes can be detected by methods known in this art. Specific monoclonal and
polyclonal antibodies can be obtained by immunizing such animals as mice, guinea pigs, rabbits, goats and
horses with NANB peptides (e.g., bearing NANB hepatitis antigenic epitope).

The present invention is based on studies on isolated virus genomes of NANB hepatitis virus strains named
HC-J6 and HC-J8, and is completed by clarification of the full sequence of the nucleotides. The invention
makes possible highly specific detection of NANB hepatitis virus and provision of polypeptide, polyclonal
antibody and monoclonal antibody to prepare the test system.

Further variations and modifications of the invention will become apparent to those skilled in the art 
from the foregoing and are intended to be encompassed by the claims appended hereto. 

Japanese Priority Applications 287402/91 filed August 9, 1991 and 360441/91 filed on December 5,
1991 are relied on and incorporated by reference. U.S. patent applications serial no. 07/540,604 (filed June
19, 1990), 07/653,090 (filed February 8, 1991), and 07/712,875 (filed June 11, 1991) are incorporated by
reference in their entirety.

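Sequence list

Sequence list 1: whole nucleotides of HC-J6 genome RNA
Sequence list 2: N-9589, whole nucleotides of cDNA to HC-J6 genome RNA
Sequence list 3: J6-081, nucleotides of clone J6-081
Sequence list 4: J6-08, nucleotides of clone J6-08
Sequence list 5: P-J6-3033, whole amino acids of ORF of HC-J6 genome
Sequence list 6: whole nucleotides of HC-J8 genome RNA
Sequence list 7: N-9511, whole nucleotides of cDNA to HC-J8 genome RNA
Sequence list 8: P-J8-3033, whole amino acids of a variation of ORF of HC-J8 genome
Sequence list 9: P-J8-3033-2, whole amino acids of a variation of ORF of HC-J8 genome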

Claims 

1. Recombinant RNA of non-A, non-B hepatitis virus, strain HC-J6, comprising the nucleotide sequence of 
sequence list 1. 


2. Recombinant cDNA of non-A, non-B hepatitis virus, strain HC-J6, comprising the nucleotide sequence 
of sequence list 2. 

3. cDNA clone J6-081 comprising the nucleotide sequence of sequence list 3.

4. cDNA clone J6-08 comprising the nucleotide sequence of sequence list 4.



whole nucleotides of HC-J6 genome RNA 

N-9589 whole nucleotides of cDNA to HC-J6 genome RNA 

J6-081 nucleotides of clone J6-081 

J6-08 nucleotides of clone J6-08 

P-J6-3033 whole amino acids of ORF of HC-J6 genome 

whole nucleotides of HC-J8 genome RNA 

whole nucleotides of cDNA to HC-J8 genome RNA 

whole amino acids of a variation of ORF of HC-J8 genome 

whole amino acids of a variation of ORF of HC-J8 genome 
5. Amino acid sequence corresponding to recombinant cDNA of non-A, non-B hepatitis virus, strain HC-J6,
comprising the amino acid sequence of sequence list 5.

6. Recombinant RNA of non-A, non-B hepatitis virus, strain HC-J8, comprising the nucleotide sequence of
sequence list 6.

7. Recombinant cDNA of non-A, non-B hepatitis virus, strain HC-J8, comprising the nucleotide sequence 
of sequence list 7. 

8. Amino acid sequence corresponding to recombinant cDNA of non-A, non-B hepatitis virus, strain HC-
J8, comprising the amino acid sequence of sequence list 8. 

9. Amino acid sequence corresponding to recombinant cDNA of non-A, non-B hepatitis virus, strain HC- 
J8, comprising the amino acid sequence of sequence list 9. 

10. A non-A, non-B hepatitis diagnostic test kit for analyzing samples for the presence of antibodies 
directed against a non-A, non-B hepatitis antigen, comprising an antigen attached to a solid substrate 
and labeled anti-human immunoglobulin; wherein said antigen is an antigen selected from the antigens 
contained in sequence lists 5, 8 or 9. 

11. A method of detecting antibodies directed against a non-A, non-B hepatitis antigen in a sample, said 
method comprising: 

(a) reacting said sample with an antigen selected from the antigens contained in sequence lists 5, 8 
or 9 to form antigen-antibody complexes; and 

(b) detecting said antigen-antibody complexes. 

12. A non-A, non-B hepatitis specific monoclonal or polyclonal antibody reactive with an antigen, said
antigen is an antigen selected from the antigens contained in sequence lists 5, 8 or 9. 

13. A method of detecting non-A, non-B hepatitis antigen in a sample, said method comprising:

(a) reacting said sample with the non-A, non-B hepatitis monoclonal or polyclonal antibody according 
to claim 12 to form antigen-antibody complexes; and 

(b) detecting said antigen-antibody complexes. 
Sequence ID No.1

Sequence Length: 9,589 

Sequence Type: nucleic acid 

Strandedness: single 

Topology: linear 

Molecule Type: genomic RNA 

Method for Determination of Feature: E 



ACCCGCCCCU AAUAGGGGCG ACACUCCGCC AUGAACCACU CCCCUGUGAG GAACUACUGU 60 

CUUCACGCAG AAAGCGUCUA GCCAUGGCGU UAGUAUGAGU GUCGUACAGC CUCCAGGCCC 120 

CCCCCUCCCG GGAGAGCCAU AGUGGUCUGC GGAACCGGUG AGUACACCGG AAUUGCCGGG 180 

AAGACUGGGU CCUUUCUUGG AUAAACCCAC UCUAUGCCCG GUCAUUUGGG CGUGCCCCCG 240 

CAAGACUGCU AGCCGAGUAG CGUUGGGUUG CGAAAGGCCU UGUGGUACUG CCUGAUAGGG 300 

UGCUUGCGAG UGCCCCGGGA GGUCUCGUAG ACCGUGCACC AUGAGCACAA AUCCUAAACC 360 

UCAAAGAAAA ACCAAAAGAA ACACCAACCG UCGCCCACAA GACGUUAAGU UUCCGGGCGG 420 

CGGCCAGAUC GUUGGCGGAG UAUACUUGUU GCCGCGCAGG GGCCCCAGGU UGGGUGUGCG 480 

CGCGACAAGG AAGACUUCGG AGCGGUCCCA GCCACGUGGA AGGCGCCAGC CCAUCCCUAA 540 

GGAUCGGCGC UCCACUGGCA AAUCCUGGGG AAAACCAGGA UACCCCUGGC CCCUAUACGG 600 

GAAUGAGGGA CUCGGCUGGG CAGGAUGGCU CCUGUCCCCC CGAGGUUCCC GUCCCUCUUG 660 

GGGCCCCAAU GACCCCCGGC AUAGGUCCCG CAACGUGGGU AAGGUCAUCG AUACCCUAAC 720

GUGCGGCUUU GCCGACCUCA UGGGGUACAU CCCUGUCGUA GGCGCCCCGC UCGGCGGCGU 780 

CGCCAGAGCU CUCGCGCAUG GCGUGAGAGU CCUGGAGGAC GGGGUUAAUU UUGCAACAGG 840 

GAACUUACCC GGUUGCUCCU UUUCUAUCUU CUUGCUGGCC CUGCUGUCCU GCAUCACCAC 900

CCCGGUCUCC GCUGCCGAAG UGAAGAACAU CAGUACCGGC UACAUGGUGA CCAACGACUG 960 

CACCAAUGAU AGCAUUACCU GGCAACUCCA GGCUGCUGUC CUCCACGUCC CCGGGUGCGU 1020 

CCCGUGCGAG AAAGUGGGGA AUACAUCUCG GUGCUGGAUA CCGGUCUCAC CGAAUGUGGC 1080 

CGUGCAGCAG CCCGGCGCCC UCACGCAGGG CUUACGGACG CACAUUGACA UGGUUGUGAU 1140 
GUCCGCCACG CUCUGCUCCG CUCUUUACGU GGGGGACCUC UGCGGUGGGG UGAUGCUUGC 1200
AGCCCAGAUG UUCAUUGUCU CGCCACAGCA CCACUGGUUU GUGCAAGACU GCAAUUGCUC 1260
CAUCUACCCU GGUACCAUCA CUGGACACCG CAUGGCGUGG GACAUGAUGA UGAACUGGUC 1320
GCCCACGGCU ACCAUGAUCC UGGCGUACGC GAUGCGCGUC CCCGAGGUCA UCAUAGACAU 1380
CAUUGGCGGG GCUCAUUGGG GCGUCAUGUU CGGCUUAGCC UACUUCUCUA UGCAGGGAGC 1440
GUGGGCAAAA GUCGUUGUCA UUCUUUUGCU GGCCGCCGGG GUGGACGCGC AAACCCAUAC 1500
CGUUGGGGGU UCUACCGCGC AUAACGCCAG GACCCUCACC GGCAUGUUCU CCCUUGGUGC 1560
CAGGCAGAAA AUCCAGCUCA UCAACACCAA UGGCAGUUGG CACAUCAACC GCACCGCCCU 1620
GAACUGCAAU GACUCUUUGC ACACCGGCUU CCUCGCGUCA CUGUUCUACA CCCACAGCUU 1680
CAACUCGUCA GGAUGUCCCG AACGCAUGUC CGCCUGCCGC AGUAUCGAGG CCUUUCGGGU 1740
GGGAUGGGGC GCCUUACAAU AUGAGGACAA UGUCACCAAU CCAGAGGAUA UGAGACCGUA 1800
UUGCUGGCAC UACCCACCAA GACAGUGUGG UGUAGUCUCC GCGAGCUCUG UGUGUGGCCC 1860
AGUGUACUGU UUCACCCCCA GCCCAGUAGU AGUGGGUACG ACCGAUAGAC UUGGAGCGCC 1920
CACUUACACG UGGGGGGAGA AUGAGACAGA UGUCUUCCUA UUGAACAGCA CUCGACCACC 1980
GCAGGGGUCA UGGUUCGGCU GCACGUGGAU GAACUCCACU GGCUACACCA AGACUUGCGG 2040
CGCACCACCC UGCCGCAUUA GAGCUGACUU CAAUGCCAGC AUGGACUUGU UGUGCCCCAC 2100
GGACUGUUUU AGGAAGCAUC CUGAUACCAC CUACAUCAAA UGUGGCUCUG GGCCCUGGCU 2160
CACGCCAAGG UGCCUGAUCG ACUACCCCUA CAGGCUCUGG CAUUACCCCU GCACAGUUAA 2220
CUAUACCAUC UUCAAAAUAA GGAUGUAUGU GGGGGGGGUC GAGCACAGGC UCACGGCUGC 2280
GUGCAAUUUC ACUCGUGGGG AUCGUUGCAA CUUGGAGGAC AGAGACAGAA GUCAACUGUC 2340
UCCUUUGCUG CACUCCACCA CGGAGUGGGC CAUUUUACCU UGCACUUACU CGGACCUGCC 2400
CGCCUUGUCG ACUGGUCUUC UCCACCUCCA CCAAAACAUC GUGGACGUGC AAUUCAUGUA 2460
UGGCCUAUCA CCUGCUCUCA CAAAAUACAU CGUCCGAUGG GAGUGGGUAG UACUCUUAUU 2520
CCUGCUCUUA GCGGACGCCA GGGUUUGCGC CUGCUUAUGG AUGCUCAUCU UGUUGGGCCA 2580
GGCCGAAGCA GCACUAGAGA AGUUGGUCGU CUUGCACGCU GCGAGCGCAG CUAGCUGCAA 2640
UGGCUUCCUA UACUUUGUCA UCUUUUUCGU GGCUGCUUGG UACAUCAAGG GUCGGGUAGU 2700
CCCCUUGGCU ACUUAUUCCC UCACUGGCCU AUGGUCCUUU GGCCUACUGC UCCUAGCAUU 2760
GCCCCAACAG GCUUAUGCUU AUGACGCAUC UGUACAUGGU CAGAUAGGAG CAGCUCUGUU 2820
GGUACUGAUC ACUCUCUUUA CACUCACCCC CGGGUAUAAG ACCCUUCUCA GCCGGUUUCU 2880
GUGGUGGUUG UGCUAUCUUC UGACCCUGGC GGAAGCUAUG GUCCAGGAGU GGGCACCACC 2940
UAUGCAGGUG CGCGGUGGCC GUGAUGGGAU CAUAUGGGCC GUCGCCAUAU UCUGCCCGGG 3000 
UGUGGUGUUU GACAUAACCA AGUGGCUCUU GGCGGUGCUU GGGCCUGCUU AUCUCCUAAA 3060 
AGGUGCUUUG ACGCGUGUGC CGUACUUCGU CAGGGCUCAC GCUCUACUAA GGAUGUGCAC 3120
CAUGGUAAGG CAUCUCGCGG GGGGUAGGUA CGUCCAGAUG GUGCUACUAG CCCUUGGCAG 3180 
GUGGACUGGC ACUUACAUCU AUGACCACCU CACCCCUAUG UCGGAUUGGG CUGCUAAUGG 3240 
CCUGCGGGAC UUGGCGGUCG CCGUGGAGCC UAUCAUCUUC AGUCCGAUGG AGAAAAAAGU 3300 
CAUCGUCUGG GGAGCGGAGA CAGCUGCUUG CGGGGAUAUC UUACACGGAC UUCCCGUGUC 3360 
CGCCCGACUU GGCCGGGAGG UCCUCCUUGG CCCAGCUGAU GGCUAUACCU CCAAGGGGUG 3420 
GAGUCUUCUC GCCCCCAUCA CUGCUUAUGC CCAGCAGACA CGCGGCCUUU UGGGCACCAU 3480 
AGUGGUGAGC AUGACGGGGC GCGACAAGAC AGAACAGGCC GGGGAGAUUC AGGUCCUGUC 3540 
CACGGUCACU CAGUCCUUCC UCGGAACAAC CAUCUCGGGG GUCUUAUGGA CUGUCUACCA 3600 
UGGAGCUGGC AACAAGACUC UAGCCGGCUC ACGGGGUCCG GUCACACAGA UGUACUCCAG 3660 
UGCUGAGGGG GACUUAGUGG GGUGGCCCAG CCCCCCCGGG ACCAAAUCUU UGGAGCCGUG 3720 
CACGUGUGGA GCGGUCGACC UAUACCUGGU CACGCGAAAC GCUGAUGUCA UCCCGGCUCG 3780 
AAGACGCGGG GACAAGCGAG GAGCGCUACU CUCCCCGAGA CCUCUUUCCA CCUUGAAGGG 3840 
GUCCUCGGGG GGCCCGGUGC UCUGCCCCAG AGGCCACGCU GUCGGGGUCU UCCGGGCAGC 3900 
CGUGUGCUCC CGGGGCGUGG CCAAGUCCAU AGAUUUUAUC CCCGUUGAGA CACUUGACAU 3960 
CGUCACUCGG UCCCCCACCU UUAGUGACAA CAGCACACCA CCUGCUGUGC CCCAAACUUA 4020 
UCAGGUCGGG UACUUACAUG CCCCGACUGG UAGUGGAAAG AGCACCAAAG UCCCUGUCGC 4080 
GUAUGCCGCU CAGGGGUACA AAGUGCUAGU GCUUAAUCCC UCGGUGGCUG CCACCCUGGG 4140
GUUUGGGGCG UACUUGUCCA AGGCACAUGG CAUCAAUCCC AACAUUAGGA CUGGGGUCAG 4200 
GACUGUGACG ACCGGGGCGC CCAUCACGUA CUCCACAUAU GGCAAAUUCC UCGCCGAUGG 4260 
GGGCUGCGCA GGCGGCGCCU AUGACAUCAU CAUAUGCGAU GAAUGCCAUG CCGUGGACUC 4320 
UACCACCAUU CUCGGCAUCG GAACAGUCCU CGAUCAAGCA GAGACAGCCG GGGUCAGGCU 4380 
AACUGUACUG GCUACGGCUA CGCCCCCCGG GUCAGUGACA ACCCCCCACC CCAACAUAGA 4440 
GGAGGUGGCC CUCGGGCAGG AGGGUGAGAU CCCCUUCUAU GGGAGGGCGA UUCCCCUGUC 4500 
AUACAUCAAG GGAGGAAGAC ACUUGAUCUU CUGCCACUCA AAGAAAAAGU GUGACGAGCU 4560 
CGCGGCGGCC CUUCGGGGUA UGGGCUUGAA CGCAGUGGCA UACUACAGAG GGCUGGACGU 4620 
CUCCGUAAUA CCAACUCAGG GAGACGUAGU GGUCGUCGCC ACCGACGCCC UCAUGACGGG 4680
GUUUACUGGA GACUUUGACU CCGUGAUCGA CUGCAACGUA GCGGUCACUC AAGUUGUAGA 4740 
CUUCAGCUUG GACCCCACAU UCACCAUAAC CACACAGACU GUCCCUCAAG ACGCUGUCUC 4800 
ACGUAGCCAG CGCCGGGGCC GCACGGGCAG GGGAAGACUG GGUAUUUAUA GGUAUGUUUC 4860 
CACUGGUGAG CGAGCCUCAG GAAUGUUUGA CAGUGUAGUG CUCUGCGAGU GCUACGAUGC 4920 
AGGGGCCGCA UGGUAUGAGC UCACACCAGC GGAGACCACC GUCAGGCUCA GAGCAUAUUU 4980 
CAACACACCU GGUUUGCCUG UGUGCCAAGA CCAUCUUGAG UUUUGGGAGG CAGUUUUCAC 5040 
CGGCCUCACA CACAUAGAUG CCCACUUCCU UUCCCAAACA AAGCAAUCGG GGGAAAAUUU 5100
CGCAUACUUA ACAGCCUACC AGGCUACAGU GUGCGCUAGG GCCAAAGCCC CCCCCCCGUC 5160 
CUGGGACGUC AUGUGGAAGU GUUUGACUCG ACUCAAGCCC ACACUCGUGG GCCCCACACC 5220 
UCUCCUGUAC CGCUUGGGCU CUGUUACCAA CGAGGUCACC CUCACGCAUC CUGUGACGAA 5280 
AUACAUCGCC ACCUGCAUGC AAGCCGACCU UGAGGUCAUG ACCAGCACGU GGGUCUUAGC 5340 
UGGGGGGGUC UUGGCGGCCG UCGCCGCGUA CUGCCUGGCG ACCGGGUGUG UUUGCAUCAU 5400 
CGGCCGCUUG CACGUUAACC AGCGAGCCGU CGUUGCACCG GACAAGGAGG UCCUCUAUGA 5460 
GGCUUUUGAU GAGAUGGAGG AAUGUGCCUC UAGAGCGGCU CUCAUUGAAG AGGGGCAGCG 5520 
GAUAGCCGAG AUGCUGAAGU CCAAGAUCCA AGGCUUAUUG CAGCAAGCUU CCAAACAAGC 5580 
UCAAGACAUA CAACCCGCUG UGCAGGCUUC UUGGCCCAAG GUAGAGCAAU UCUGGGCCAA 5640 
ACACAUGUGG AACUUCAUCA GCGGCAUUCA AUACCUCGCA GGACUAUCAA CACUGCCAGG 5700 
GAACCCUGCU GUAGCUUCCA UGAUGGCAUU CAGUGCCGCC CUCACCAGUC CGUUGUCAAC 5760 
UAGCACCACU AUCCUUCUCA ACAUUUUGGG GGGCUGGCUA GCAUCCCAAA UUGCGCCUCC 5820
CGCGGGGGCU ACCGGCUUCG UCGUCAGUGG CCUGGUGGGG GCUGCCGUAG GCAGCAUAGG 5880 
CUUGGGUAAG GUGCUGGUGG ACAUCCUGGC AGGGUAUGGU GCGGGCAUUU CGGGGGCUCU 5940 
CGUCGCAUUC AAGAUCAUGU CUGGCGAGAA GCCCUCCAUG GAGGAUGUUG UCAACCUGCU 6000 
GCCUGGAAUU CUGUCUCCGG GUGCCCUGGU GGUGGGAGUC AUCUGCGCGG CCAUCCUGCG 6060 
CCGACACGUG GGACCGGGGG AAGGCGCUGU CCAAUGGAUG AAUAGGCUCA UUGCCUUUGC 6120 
UUCCAGAGGA AACCACGUCG CCCCCACCCA CUACGUGACG GAGUCGGAUG CGUCGCAGCG 6180 
UGUGACCCAA CUACUUGGCU CCCUUACCAU AACCAGCCUG CUCAGGAGAC UCCACAACUG 6240 
GAUUACUGAA GACUGCCCCA UCCCAUGCAG CGGCUCGUGG CUCCGCGAUG UGUGGGAUUG 6300 
GGUUUGCACC AUCCUAACAG ACUUUAAAAA CUGGCUGACC UCCAAAUUGU UCCCAAAGAU 6360 
GCCUGGUCUC CCCUUUAUCU CUUGUCAAAA GGGGUACAAG GGCGUGUGGG CUGGCACUGG 6420
UAUCAUGACC ACACGGUGUC CUUGCGGCGC CAAUAUCUCU GGCAAUGUCC GCCUGGGCUC 6480
CAUGAGAAUU ACGGGGCCCA AAACCUGCAU GAAUAUCUGG CAGGGGACCU UUCCCAUCAA 6540
UUGUUACACG GAGGGCCAGU GCGUGCCGAA ACCCGCACCA AACUUUAAGA UCGCCAUCUG 6600
GAGGGUGGCG GCCUCAGAGU ACGCGGAGGU GACGCAGCAC GGGUCAUACC ACUACAUAAC 6660
AGGACUUACC ACUGAUAACU UGAAAGUUCC UUGCCAACUA CCUUCUCCAG AGUUCUUUUC 6720
CUGGGUGGAC GGAGUGCAGA UCCAUAGGUU UGCCCCCAUA CCGAAGCCGU UUUUUCGGGA 6780
UGAGGUCUCG UUCUGCGUUG GGCUUAAUUC AUUUGUCGUC GGGUCUCAGC UCCCUUGCGA 6840
UCCUGAACCU GACACAGACG UAUUGACGUC CAUGCUAACA GACCCAUCCC AUAUCACGGC 6900
GGAGACUGCA GCGCGGCGUU UGGCACGGGG GUCACCCCCG UCCGAGGCAA GCUCCUCAGC 6960
GAGCCAGCUA UCGGCACCAU CGCUGCGAGC CACCUGCACC ACCCACGGCA AGGCCUAUGA 7020
UGUGGACAUG GUGGAUGCCA ACCUGUUCAU GGGGGGCGAU GUGACCCGGA UAGAGUCUGA 7080
GUCCAAAGUG GUCGUUCUGG ACUCUCUCGA CCCAAUGGUC GAAGAAAGGA GCGACCUUGA 7140
GCCUUCGAUA CCAUCGGAAU AUAUGCUCCC CAAGAAGAGA UUCCCACCAG CCUUACCGGC 7200
UUGGGCACGG CCUGAUUACA ACCCACCGCU UGUGGAAUCG UGGAAGAGGC CAGAUUACCA 7260
ACCGGCCACU GUUGCGGGCU GCGCUCUCCC CCCCCCUAAG AAAACCCCGA CGCCUCCCCC 7320
AAGGAGACGC CGGACAGUGG GUCUGAGUGA GAGCUCCAUA GCAGAUGCCC UACAACAGCU 7380
GGCCAUCAAG UCCUUUGGCC AGCCCCCCCC AAGCGGCGAU UCAGGCCUUU CCACGGGGGC 7440
GGACGCAGCC GAUUCCGGCA GUCGGACGCC CCCCGAUGAG UUGGCCCUUU CGGAGACAGG 7500
UUCCAUCUCC UCCAUGCCCC CUCUCGAGGG GGAGCCUGGA GAUCCAGACU UGGAGCCUGA 7560
GCAGGUAGAG CUUCAACCUC CCCCCCAGGG GGGGGUGGUA ACCCCCGGCU CAGGCUCGGG 7620
GUCUUGGUCU ACUUGCUCCG AGGAGGACGA CUCCGUCGUG UGCUGCUCCA UGUCAUACUC 7680
CUGGACCGGG GCUCUAAUAA CUCCUUGUAG CCCCGAAGAG GAAAAGUUGC CAAUUGGCCC 7740
CUUGAGCAAC UCCCUGUUGC GAUAUCACAA CAAGGUGUAC UGUACCACAU CAAAGAGCGC 7800
CUCAUUAAGG GCUAAAAAGG UAACUUUUGA UAGGAUGCAA GCGCUCGACG CUCAUUAUGA 7860
CUCAGUCUUG AAGGACAUUA AGCUAGCGGC CUCCAAGGUC ACCGCAAGGC UUCUCACUUU 7920
AGAGGAGGCC UGCCAGUUAA CUCCACCCCA CUCUGCAAGA UCCAAGUAUG GGUUUGGGGC 7980
UAAGGAGGUC CGCAGCUUGU CCGGGAGAGC CGUUAACCAC AUCAAGUCCG UGUGGAAGGA 8040
CCUCCUGGAA GACACACAAA CACCAAUUCC UACAACCAUC AUGGCCAAAA AUGAGGUGUU 8100
CUGCGUGGAC CCCACCAAGG GGGGUAAGAA AGCAGCUCGC CUUAUCGUUU ACCCUGACCU 8160
CGGCGUCAGG GUCUGCGAGA AAAUGGCCCU UUAUGAUAUC ACACAAAAGC UUCCUGAGGC 8220
GGUGAUGGGG GCUUCUUAUG GAUUCCAGUA CUCCCCCGCU CAGCGGGUGG AGUUUCUCUU 8280
GAAGGCAUGG GCGGAAAAGA AAGACCCUAU GGGUUUUUCG UAUGAUACCC GAUGCUUUGA 8340
CUCAACCGUC ACUGAGAGAG ACAUCAGGAC UGAGGAGUCC AUAUAUCGGG CUUGUUCCUU 8400
GCCCGAGGAG GCCCACACUG CCAUACACUC ACUGACUGAG AGACUUUACG UGGGAGGGCC 8460
CAUGUUCAAC AGCAAGGGCC AGACCUGCGG GUACAGGCGU UGCCGCGCCA GCGGGGUGCU 8520
UACCACUAGC AUGGGGAACA CCAUCACAUG CUAUGUGAAA GCCUUAGCGG CCUGUAAGGC 8580
UGCAGGGAUA AUUGCGCCCA CAAUGCUGGU AUGCGGCGAU GACUUGGUUG UCAUCUCAGA 8640
GAGCCAGGGG ACCGAGGAGG ACGAGCGGAA CCUGAGAGCC UUCACGGAGG CUAUGACCAG 8700
GUAUUCUGCC CCUCCUGGUG ACCCCCCCAG ACCGGAAUAU GACCUGGAGC UGAUAACAUC 8760
UUGCUCCUCA AAUGUGUCUG UGGCGUUGGG CCCACAAGGC CGCCGCAGAU ACUACCUGAC 8820
CAGAGACCCU ACCACUCCAA UCGCCCGGGC UGCCUGGGAA ACAGUUAGAC ACUCCCCUGU 8880
CAAUUCAUGG CUAGGAAACA UCAUCCAGUA CGCCCCAACC AUAUGGGCUC GCAUGGUCCU 8940
GAUGACACAC UUCUUCUCCA UUCUCAUGGC CCAAGAUACU CUGGACCAGA ACCUCAACUU 9000
UGAGAUGUAC GGAGCGGUGU ACUCCGUGAG UCCCUUGGAC CUCCCAGCCA UAAUUGAAAG 9060
GUUACACGGG CUUGACGCUU UCUCUCUGCA CACAUACACU CCCCACGAAC UGACACGGGU 9120
GGCUUCAGCC CUCAGAAAAC UUGGGGCGCC ACCCCUCAGA GCGUGGAAGA GCCGGGCACG 9180
UGCAGUCAGG GCGUCCCUCA UCUCCCGUGG GGGGAGAGCG GCCGUUUGCG GCCGAUAUCU 9240
CUUCAACUGG GCGGUGAAGA CCAAGCUCAA ACUCACUCCA UUGCCGGAAG CGCGCCUCCU 9300
GGAUUUAUCC AGCUGGUUCA CUGUCGGCGC CGGCGGGGGC GACAUUUAUC ACAGCGUGUC 9360
GCGUGCCCGA CCCCGCUUAU UACUCCUUGG CCUACUCCUA CUUUUUGUAG GGGUAGGCCU 9420
UUUCCUACUC CCCGCUCGGU AGAGCGGCAC ACAUUAGCUA CACUCCAUAG CUAACUGUCC 9480
CUUUUUUUUU UUUUUUUUUU UUUUUUUUUU UUUUUUUUUU UUUUUUUUUU UUUUUUUUUU 9540
UUUUUUUUUU UUUUUUUUUU UUUUUUUUUU UUUUUUUUUU UUUUUUUUU 9589
Sequence ID No.2
Sequence Length: 9,589 
Sequence Type: nucleic acid 
Strandedness: single 
Topology: linear 

Molecule Type: cDNA to genomic RNA 
Method for Determination of Feature: E 



ACCCGCCCCT AATAGGGGCG ACACTCCGCC ATGAACCACT CCCCTGTGAG GAACTACTGT 60 

CTTCACGCAG AAAGCGTCTA GCCATGGCGT TAGTATGAGT GTCGTACAGC CTCCAGGCCC 120 

CCCCCTCCCG GGAGAGCCAT AGTGGTCTGC GGAACCGGTG AGTACACCGG AATTGCCGGG 180 

AAGACTGGGT CCTTTCTTGG ATAAACCCAC TCTATGCCCG GTCATTTGGG CGTGCCCCCG 240 

CAAGACTGCT AGCCGAGTAG CGTTGGGTTG CGAAAGGCCT TGTGGTACTG CCTGATAGGG 300 

TGCTTGCGAG TGCCCCGGGA GGTCTCGTAG ACCGTGCACC ATGAGCACAA ATCCTAAACC 360 

TCAAAGAAAA ACCAAAAGAA ACACCAACCG TCGCCCACAA GACGTTAAGT TTCCGGGCGG 420 

CGGCCAGATC GTTGGCGGAG TATACTTGTT GCCGCGCAGG GGCCCCAGGT TGGGTGTGCG 480 

CGCGACAAGG AAGACTTCGG AGCGGTCCCA GCCACGTGGA AGGCGCCAGC CCATCCCTAA 540 

GGATCGGCGC TCCACTGGCA AATCCTGGGG AAAACCAGGA TACCCCTGGC CCCTATACGG 600 

GAATGAGGGA CTCGGCTGGG CAGGATGGCT CCTGTCCCCC CGAGGTTCCC GTCCCTCTTG 660 

GGGCCCCAAT GACCCCCGGC ATAGGTCCCG CAACGTGGGT AAGGTCATCG ATACCCTAAC 720 

GTGCGGCTTT GCCGACCTCA TGGGGTACAT CCCTGTCGTA GGCGCCCCGC TCGGCGGCGT 780 

CGCCAGAGCT CTCGCGCATG GCGTGAGAGT CCTGGAGGAC GGGGTTAATT TTGCAACAGG 840 

GAACTTACCC GGTTGCTCCT TTTCTATCTT CTTGCTGGCC CTGCTGTCCT GCATCACCAC 900 

CCCGGTCTCC GCTGCCGAAG TGAAGAACAT CAGTACCGGC TACATGGTGA CCAACGACTG 960 

CACCAATGAT AGCATTACCT GGCAACTCCA GGCTGCTGTC CTCCACGTCC CCGGGTGCGT 1020 

CCCGTGCGAG AAAGTGGGGA ATACATCTCG GTGCTGGATA CCGGTCTCAC CGAATGTGGC 1080 

CGTGCAGCAG CCCGGCGCCC TCACGCAGGG CTTACGGACG CACATTGACA TGGTTGTGAT 1140 

GTCCGCCACG CTCTGCTCCG CTCTTTACGT GGGGGACCTC TGCGGTGGGG TGATGCTTGC 1200 
AGCCCAGATG TTCATTGTCT CGCCACAGCA CCACTGGTTT GTGCAAGACT GCAATTGCTC 1260
CATCTACCCT GGTACCATCA CTGGACACCG CATGGCGTGG GACATGATGA TGAACTGGTC 1320 
GCCCACGGCT ACCATGATCC TGGCGTACGC GATGCGCGTC CCCGAGGTCA TCATAGACAT 1380 
CATTGGCGGG GCTCATTGGG GCGTCATGTT CGGCTTAGCC TACTTCTCTA TGCAGGGAGC 1440
GTGGGCAAAA GTCGTTGTCA TTCTTTTGCT GGCCGCCGGG GTGGACGCGC AAACCCATAC 1500 
CGTTGGGGGT TCTACCGCGC ATAACGCCAG GACCCTCACC GGCATGTTCT CCCTTGGTGC 1560 
CAGGCAGAAA ATCCAGCTCA TCAACACCAA TGGCAGTTGG CACATCAACC GCACCGCCCT 1620 
GAACTGCAAT GACTCTTTGC ACACCGGCTT CCTCGCGTCA CTGTTCTACA CCCACAGCTT 1680 
CAACTCGTCA GGATGTCCCG AACGCATGTC CGCCTGCCGC AGTATCGAGG CCTTTCGGGT 1740 
GGGATGGGGC GCCTTACAAT ATGAGGACAA TGTCACCAAT CCAGAGGATA TGAGACCGTA 1800 
TTGCTGGCAC TACCCACCAA GACAGTGTGG TGTAGTCTCC GCGAGCTCTG TGTGTGGCCC 1860 
AGTGTACTGT TTCACCCCCA GCCCAGTAGT AGTGGGTACG ACCGATAGAC TTGGAGCGCC 1920 
CACTTACACG TGGGGGGAGA ATGAGACAGA TGTCTTCCTA TTGAACAGCA CTCGACCACC 1980 
GCAGGGGTCA TGGTTCGGCT GCACGTGGAT GAACTCCACT GGCTACACCA AGACTTGCGG 2040 
CGCACCACCC TGCCGCATTA GAGCTGACTT CAATGCCAGC ATGGACTTGT TGTGCCCCAC 2100 
GGACTGTTTT AGGAAGCATC CTGATACCAC CTACATCAAA TGTGGCTCTG GGCCCTGGCT 2160 
CACGCCAAGG TGCCTGATCG ACTACCCCTA CAGGCTCTGG CATTACCCCT GCACAGTTAA 2220 
CTATACCATC TTCAAAATAA GGATGTATGT GGGGGGGGTC GAGCACAGGC TCACGGCTGC 2280 
GTGCAATTTC ACTCGTGGGG ATCGTTGCAA CTTGGAGGAC AGAGACAGAA GTCAACTGTC 2340 
TCCTTTGCTG CACTCCACCA CGGAGTGGGC CATTTTACCT TGCACTTACT CGGACCTGCC 2400 
CGCCTTGTCG ACTGGTCTTC TCCACCTCCA CCAAAACATC GTGGACGTGC AATTCATGTA 2460 
TGGCCTATCA CCTGCTCTCA CAAAATACAT CGTCCGATGG GAGTGGGTAG TACTCTTATT 2520 
CCTGCTCTTA GCGGACGCCA GGGTTTGCGC CTGCTTATGG ATGCTCATCT TGTTGGGCCA 2580 
GGCCGAAGCA GCACTAGAGA AGTTGGTCGT CTTGCACGCT GCGAGCGCAG CTAGCTGCAA 2640 
TGGCTTCCTA TACTTTGTCA TCTTTTTCGT GGCTGCTTGG TACATCAAGG GTCGGGTAGT 2700 
CCCCTTGGCT ACTTATTCCC TCACTGGCCT ATGGTCCTTT GGCCTACTGC TCCTAGCATT 2760 
GCCCCAACAG GCTTATGCTT ATGACGCATC TGTACATGGT CAGATAGGAG CAGCTCTGTT 2820 
GGTACTGATC ACTCTCTTTA CACTCACCCC CGGGTATAAG ACCCTTCTCA GCCGGTTTCT 2880 
GTGGTGGTTG TGCTATCTTC TGACCCTGGC GGAAGCTATG GTCCAGGAGT GGGCACCACC 2940 
TATGCAGGTG CGCGGTGGCC GTGATGGGAT CATATGGGCC GTCGCCATAT TCTGCCCGGG 3000
TGTGGTGTTT GACATAACCA AGTGGCTCTT GGCGGTGCTT GGGCCTGCTT ATCTCCTAAA 3060 
AGGTGCTTTG ACGCGTGTGC CGTACTTCGT CAGGGCTCAC GCTCTACTAA GGATGTGCAC 3120 
CATGGTAAGG CATCTCGCGG GGGGTAGGTA CGTCCAGATG GTGCTACTAG CCCTTGGCAG 3180
GTGGACTGGC ACTTACATCT ATGACCACCT CACCCCTATG TCGGATTGGG CTGCTAATGG 3240 
CCTGCGGGAC TTGGCGGTCG CCGTGGAGCC TATCATCTTC AGTCCGATGG AGAAAAAAGT 3300 
CATCGTCTGG GGAGCGGAGA CAGCTGCTTG CGGGGATATC TTACACGGAC TTCCCGTGTC 3360 
CGCCCGACTT GGCCGGGAGG TCCTCCTTGG CCCAGCTGAT GGCTATACCT CCAAGGGGTG 3420 
GAGTCTTCTC GCCCCCATCA CTGCTTATGC CCAGCAGACA CGCGGCCTTT TGGGCACCAT 3480 
AGTGGTGAGC ATGACGGGGC GCGACAAGAC AGAACAGGCC GGGGAGATTC AGGTCCTGTC 3540 
CACGGTCACT CAGTCCTTCC TCGGAACAAC CATCTCGGGG GTCTTATGGA CTGTCTACCA 3600 
TGGAGCTGGC AACAAGACTC TAGCCGGCTC ACGGGGTCCG GTCACACAGA TGTACTCCAG 3660 
TGCTGAGGGG GACTTAGTGG GGTGGCCCAG CCCCCCCGGG ACCAAATCTT TGGAGCCGTG 3720 
CACGTGTGGA GCGGTCGACC TATACCTGGT CACGCGAAAC GCTGATGTCA TCCCGGCTCG 3780 
AAGACGCGGG GACAAGCGAG GAGCGCTACT CTCCCCGAGA CCTCTTTCCA CCTTGAAGGG 3840 
GTCCTCGGGG GGCCCGGTGC TCTGCCCCAG AGGCCACGCT GTCGGGGTCT TCCGGGCAGC 3900 
CGTGTGCTCC CGGGGCGTGG CCAAGTCCAT AGATTTTATC CCCGTTGAGA CACTTGACAT 3960 
CGTCACTCGG TCCCCCACCT TTAGTGACAA CAGCACACCA CCTGCTGTGC CCCAAACTTA 4020 
TCAGGTCGGG TACTTACATG CCCCGACTGG TAGTGGAAAG AGCACCAAAG TCCCTGTCGC 4080 
GTATGCCGCT CAGGGGTACA AAGTGCTAGT GCTTAATCCC TCGGTGGCTG CCACCCTGGG 4140 
GTTTGGGGCG TACTTGTCCA AGGCACATGG CATCAATCCC AACATTAGGA CTGGGGTCAG 4200 
GACTGTGACG ACCGGGGCGC CCATCACGTA CTCCACATAT GGCAAATTCC TCGCCGATGG 4260 
GGGCTGCGCA GGCGGCGCCT ATGACATCAT CATATGCGAT GAATGCCATG CCGTGGACTC 4320 
TACCACCATT CTCGGCATCG GAACAGTCCT CGATCAAGCA GAGACAGCCG GGGTCAGGCT 4380 
AACTGTACTG GCTACGGCTA CGCCCCCCGG GTCAGTGACA ACCCCCCACC CCAACATAGA 4440 
GGAGGTGGCC CTCGGGCAGG AGGGTGAGAT CCCCTTCTAT GGGAGGGCGA TTCCCCTGTC 4500 
ATACATCAAG GGAGGAAGAC ACTTGATCTT CTGCCACTCA AAGAAAAAGT GTGACGAGCT 4560 
CGCGGCGGCC CTTCGGGGTA TGGGCTTGAA CGCAGTGGCA TACTACAGAG GGCTGGACGT 4620 
CTCCGTAATA CCAACTCAGG GAGACGTAGT GGTCGTCGCC ACCGACGCCC TCATGACGGG 4680 
GTTTACTGGA GACTTTGACT CCGTGATCGA CTGCAACGTA GCGGTCACTC AAGTTGTAGA 4740
CTTCAGCTTG GACCCCACAT TCACCATAAC CACACAGACT GTCCCTCAAG ACGCTGTCTC 4800 
ACGTAGCCAG CGCCGGGGCC GCACGGGCAG GGGAAGACTG GGTATTTATA GGTATGTTTC 4860 
CACTGGTGAG CGAGCCTCAG GAATGTTTGA CAGTGTAGTG CTCTGCGAGT GCTACGATGC 4920
AGGGGCCGCA TGGTATGAGC TCACACCAGC GGAGACCACC GTCAGGCTCA GAGCATATTT 4980 
CAACACACCT GGTTTGCCTG TGTGCCAAGA CCATCTTGAG TTTTGGGAGG CAGTTTTCAC 5040 
CGGCCTCACA CACATAGATG CCCACTTCCT TTCCCAAACA AAGCAATCGG GGGAAAATTT 5100 
CGCATACTTA ACAGCCTACC AGGCTACAGT GTGCGCTAGG GCCAAAGCCC CCCCCCCGTC 5160 
CTGGGACGTC ATGTGGAAGT GTTTGACTCG ACTCAAGCCC ACACTCGTGG GCCCCACACC 5220 
TCTCCTGTAC CGCTTGGGCT CTGTTACCAA CGAGGTCACC CTCACGCATC CTGTGACGAA 5280 
ATACATCGCC ACCTGCATGC AAGCCGACCT TGAGGTCATG ACCAGCACGT GGGTCTTAGC 5340 
TGGGGGGGTC TTGGCGGCCG TCGCCGCGTA CTGCCTGGCG ACCGGGTGTG TTTGCATCAT 5400 
CGGCCGCTTG CACGTTAACC AGCGAGCCGT CGTTGCACCG GACAAGGAGG TCCTCTATGA 5460 
GGCTTTTGAT GAGATGGAGG AATGTGCCTC TAGAGCGGCT CTCATTGAAG AGGGGCAGCG 5520 
GATAGCCGAG ATGCTGAAGT CCAAGATCCA AGGCTTATTG CAGCAAGCTT CCAAACAAGC 5580 
TCAAGACATA CAACCCGCTG TGCAGGCTTC TTGGCCCAAG GTAGAGCAAT TCTGGGCCAA 5640 
ACACATGTGG AACTTCATCA GCGGCATTCA ATACCTCGCA GGACTATCAA CACTGCCAGG 5700 
GAACCCTGCT GTAGCTTCCA TGATGGCATT CAGTGCCGCC CTCACCAGTC CGTTGTCAAC 5760 
TAGCACCACT ATCCTTCTCA ACATTTTGGG GGGCTGGCTA GCATCCCAAA TTGCGCCTCC 5820 
CGCGGGGGCT ACCGGCTTCG TCGTCAGTGG CCTGGTGGGG GCTGCCGTAG GCAGCATAGG 5880 
CTTGGGTAAG GTGCTGGTGG ACATCCTGGC AGGGTATGGT GCGGGCATTT CGGGGGCTCT 5940
CGTCGCATTC AAGATCATGT CTGGCGAGAA GCCCTCCATG GAGGATGTTG TCAACCTGCT 6000 
GCCTGGAATT CTGTCTCCGG GTGCCCTGGT GGTGGGAGTC ATCTGCGCGG CCATCCTGCG 6060 
CCGACACGTG GGACCGGGGG AAGGCGCTGT CCAATGGATG AATAGGCTCA TTGCCTTTGC 6120 
TTCCAGAGGA AACCACGTCG CCCCCACCCA CTACGTGACG GAGTCGGATG CGTCGCAGCG 6180 
TGTGACCCAA CTACTTGGCT CCCTTACCAT AACCAGCCTG CTCAGGAGAC TCCACAACTG 6240 
GATTACTGAA GACTGCCCCA TCCCATGCAG CGGCTCGTGG CTCCGCGATG TGTGGGATTG 6300 
GGTTTGCACC ATCCTAACAG ACTTTAAAAA CTGGCTGACC TCCAAATTGT TCCCAAAGAT 6360 
GCCTGGTCTC CCCTTTATCT CTTGTCAAAA GGGGTACAAG GGCGTGTGGG CTGGCACTGG 6420 






EP 0 532 167 A2 



TATCATGACC ACACGGTGTC CTTGCGGCGC CAATATCTCT GGCAATGTCC GCCTGGGCTC 6480 
CATGAGAATT ACGGGGCCCA AAACCTGCAT GAATATCTGG CAGGGGACCT TTCCCATCAA 6540 
TTGTTACACG GAGGGCCAGT GCGTGCCGAA ACCCGCACCA AACTTTAAGA TCGCCATCTG 6600 
GAGGGTGGCG GCCTCAGAGT ACGCGGAGGT GACGCAGCAC GGGTCATACC ACTACATAAC 6660
AGGACTTACC ACTGATAACT TGAAAGTTCC TTGCCAACTA CCTTCTCCAG AGTTCTTTTC 6720 
CTGGGTGGAC GGAGTGCAGA TCCATAGGTT TGCCCCCATA CCGAAGCCGT TTTTTCGGGA 6780 
TGAGGTCTCG TTCTGCGTTG GGCTTAATTC ATTTGTCGTC GGGTCTCAGC TCCCTTGCGA 6840
TCCTGAACCT GACACAGACG TATTGACGTC CATGCTAACA GACCCATCCC ATATCACGGC 6900 
GGAGACTGCA GCGCGGCGTT TGGCACGGGG GTCACCCCCG TCCGAGGCAA GCTCCTCAGC 6960 
GAGCCAGCTA TCGGCACCAT CGCTGCGAGC CACCTGCACC ACCCACGGCA AGGCCTATGA 7020 
TGTGGACATG GTGGATGCCA ACCTGTTCAT GGGGGGCGAT GTGACCCGGA TAGAGTCTGA 7080 
GTCCAAAGTG GTCGTTCTGG ACTCTCTCGA CCCAATGGTC GAAGAAAGGA GCGACCTTGA 7140 
GCCTTCGATA CCATCGGAAT ATATGCTCCC CAAGAAGAGA TTCCCACCAG CCTTACCGGC 7200 
TTGGGCACGG CCTGATTACA ACCCACCGCT TGTGGAATCG TGGAAGAGGC CAGATTACCA 7260 
ACCGGCCACT GTTGCGGGCT GCGCTCTCCC CCCCCCTAAG AAAACCCCGA CGCCTCCCCC 7320 
AAGGAGACGC CGGACAGTGG GTCTGAGTGA GAGCTCCATA GCAGATGCCC TACAACAGCT 7380 
GGCCATCAAG TCCTTTGGCC AGCCCCCCCC AAGCGGCGAT TCAGGCCTTT CCACGGGGGC 7440 
GGACGCAGCC GATTCCGGCA GTCGGACGCC CCCCGATGAG TTGGCCCTTT CGGAGACAGG 7500 
TTCCATCTCC TCCATGCCCC CTCTCGAGGG GGAGCCTGGA GATCCAGACT TGGAGCCTGA 7560 
GCAGGTAGAG CTTCAACCTC CCCCCCAGGG GGGGGTGGTA ACCCCCGGCT CAGGCTCGGG 7620 
GTCTTGGTCT ACTTGCTCCG AGGAGGACGA CTCCGTCGTG TGCTGCTCCA TGTCATACTC 7680 
CTGGACCGGG GCTCTAATAA CTCCTTGTAG CCCCGAAGAG GAAAAGTTGC CAATTGGCCC 7740 
CTTGAGCAAC TCCCTGTTGC GATATCACAA CAAGGTGTAC TGTACCACAT CAAAGAGCGC 7800 
CTCATTAAGG GCTAAAAAGG TAACTTTTGA TAGGATGCAA GCGCTCGACG CTCATTATGA 7860 
CTCAGTCTTG AAGGACATTA AGCTAGCGGC CTCCAAGGTC ACCGCAAGGC TTCTCACTTT 7920 
AGAGGAGGCC TGCCAGTTAA CTCCACCCCA CTCTGCAAGA TCCAAGTATG GGTTTGGGGC 7980 
TAAGGAGGTC CGCAGCTTGT CCGGGAGAGC CGTTAACCAC ATCAAGTCCG TGTGGAAGGA 8040 
CCTCCTGGAA GACACACAAA CACCAATTCC TACAACCATC ATGGCCAAAA ATGAGGTGTT 8100 
CTGCGTGGAC CCCACCAAGG GGGGTAAGAA AGCAGCTCGC CTTATCGTTT ACCCTGACCT 8160 
CGGCGTCAGG GTCTGCGAGA AAATGGCCCT TTATGATATC ACACAAAAGC TTCCTGAGGC 8220
GGTGATGGGG GCTTCTTATG GATTCCAGTA CTCCCCCGCT CAGCGGGTGG AGTTTCTCTT 8280 
GAAGGCATGG GCGGAAAAGA AAGACCCTAT GGGTTTTTCG TATGATACCC GATGCTTTGA 8340 
CTCAACCGTC ACTGAGAGAG ACATCAGGAC TGAGGAGTCC ATATATCGGG CTTGTTCCTT 8400
GCCCGAGGAG GCCCACACTG CCATACACTC ACTGACTGAG AGACTTTACG TGGGAGGGCC 8460 
CATGTTCAAC AGCAAGGGCC AGACCTGCGG GTACAGGCGT TGCCGCGCCA GCGGGGTGCT 8520 
TACCACTAGC ATGGGGAACA CCATCACATG CTATGTGAAA GCCTTAGCGG CCTGTAAGGC 8580 
TGCAGGGATA ATTGCGCCCA CAATGCTGGT ATGCGGCGAT GACTTGGTTG TCATCTCAGA 8640 
GAGCCAGGGG ACCGAGGAGG ACGAGCGGAA CCTGAGAGCC TTCACGGAGG CTATGACCAG 8700 
GTATTCTGCC CCTCCTGGTG ACCCCCCCAG ACCGGAATAT GACCTGGAGC TGATAACATC 8760 
TTGCTCCTCA AATGTGTCTG TGGCGTTGGG CCCACAAGGC CGCCGCAGAT ACTACCTGAC 8820 
CAGAGACCCT ACCACTCCAA TCGCCCGGGC TGCCTGGGAA ACAGTTAGAC ACTCCCCTGT 8880 
CAATTCATGG CTAGGAAACA TCATCCAGTA CGCCCCAACC ATATGGGCTC GCATGGTCCT 8940 
GATGACACAC TTCTTCTCCA TTCTCATGGC CCAAGATACT CTGGACCAGA ACCTCAACTT 9000 
TGAGATGTAC GGAGCGGTGT ACTCCGTGAG TCCCTTGGAC CTCCCAGCCA TAATTGAAAG 9060 
GTTACACGGG CTTGACGCTT TCTCTCTGCA CACATACACT CCCCACGAAC TGACACGGGT 9120 
GGCTTCAGCC CTCAGAAAAC TTGGGGCGCC ACCCCTCAGA GCGTGGAAGA GCCGGGCACG 9180 
TGCAGTCAGG GCGTCCCTCA TCTCCCGTGG GGGGAGAGCG GCCGTTTGCG GCCGATATCT 9240 
CTTCAACTGG GCGGTGAAGA CCAAGCTCAA ACTCACTCCA TTGCCGGAAG CGCGCCTCCT 9300 
GGATTTATCC AGCTGGTTCA CTGTCGGCGC CGGCGGGGGC GACATTTATC ACAGCGTGTC 9360
GCGTGCCCGA CCCCGCTTAT TACTCCTTGG CCTACTCCTA CTTTTTGTAG GGGTAGGCCT 9420 
TTTCCTACTC CCCGCTCGGT AGAGCGGCAC ACATTAGCTA CACTCCATAG CTAACTGTCC 9480 
CTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT 9540 
TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTT 9589 
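
The nucleotide listings are printed as rows of up to six ten-nucleotide groups, each row ending with the running position of its last nucleotide. The following is a minimal Python sketch, not part of the original disclosure, of how such rows could be read back into one continuous string and how the cDNA notation (T) could be rewritten in RNA notation (U). The function names `parse_listing` and `cdna_to_rna` are hypothetical and chosen only for illustration.

```python
def parse_listing(text):
    """Join the nucleotide groups of a printed listing into one string.

    Each non-empty row is assumed to end with the running position of
    its last nucleotide; that counter is checked against the content.
    """
    sequence = ""
    for row in text.splitlines():
        fields = row.split()
        if not fields:
            continue  # skip blank separator lines
        *groups, counter = fields
        sequence += "".join(groups).upper()
        if len(sequence) != int(counter):
            raise ValueError(f"row ending at {counter} has wrong length")
    return sequence


def cdna_to_rna(cdna):
    """Rewrite a sense-strand cDNA string in RNA notation (T -> U)."""
    return cdna.upper().replace("T", "U")


# First two rows of Sequence ID No. 2 as printed in the listing.
rows = """
ACCCGCCCCT AATAGGGGCG ACACTCCGCC ATGAACCACT CCCCTGTGAG GAACTACTGT 60
CTTCACGCAG AAAGCGTCTA GCCATGGCGT TAGTATGAGT GTCGTACAGC CTCCAGGCCC 120
"""
cdna = parse_listing(rows)
rna = cdna_to_rna(cdna)
print(len(cdna), rna[:10])
```

Because the running counter is checked row by row, a dropped or duplicated nucleotide group is reported at the first row where the count stops matching, which is convenient when a listing has been re-keyed or recovered from a scan.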
Sequence ID No.3
Sequence Length: 3,970
Sequence Type: nucleic acid
Strandedness: single
Topology: linear 

Molecule Type: cDNA to genomic RNA 
Method for Determination of Feature: E 



GGCATTACCC CTGCACAGTT AACTATACCA TCTTCAAAAT AAGGATGTAT GTGGGGGGGG 60
TCGAGCACAG GCTCACGGCT GCGTGCAATT TCACTCGTGG GGATCGTTGC AACTTGGAGG 120
ACAGAGACAG AAGTCAACTG TCTCCTTTGC TGCACTCCAC CACGGAGTGG GCCATTTTAC 180
CTTGCACTTA CTCGGACCTG CCCGCCTTGT CGACTGGTCT TCTCCACCTC CACCAAAACA 240
TCGTGGACGT GCAATTCATG TATGGCCTAT CACCTGCTCT CACAAAATAC ATCGTCCGAT 300
GGGAGTGGGT AGTACTCTTA TTCCTGCTCT TAGCGGACGC CAGGGTTTGC GCCTGCTTAT 360
GGATGCTCAT CTTGTTGGGC CAGGCCGAAG CAGCACTAGA GAAGTTGGTC GTCTTGCACG 420
CTGCGAGCGC AGCTAGCTGC AATGGCTTCC TATACTTTGT CATCTTTTTC GTGGCTGCTT 480
GGTACATCAA GGGTCGGGTA GTCCCCTTGG CTACTTATTC CCTCACTGGC CTATGGTCCT 540
TTGGCCTACT GCTCCTAGCA TTGCCCCAAC AGGCTTATGC TTATGACGCA TCTGTACATG 600
GTCAGATAGG AGCAGCTCTG TTGGTACTGA TCACTCTCTT TACACTCACC CCCGGGTATA 660
AGACCCTTCT CAGCCGGTTT CTGTGGTGGT TGTGCTATCT TCTGACCCTG GCGGAAGCTA 720
TGGTCCAGGA GTGGGCACCA CCTATGCAGG TGCGCGGTGG CCGTGATGGG ATCATATGGG 780
CCGTCGCCAT ATTCTGCCCG GGTGTGGTGT TTGACATAAC CAAGTGGCTC TTGGCGGTGC 840
TTGGGCCTGC TTATCTCCTA AAAGGTGCTT TGACGCGTGT GCCGTACTTC GTCAGGGCTC 900
ACGCTCTACT AAGGATGTGC ACCATGGTAA GGCATCTCGC GGGGGGTAGG TACGTCCAGA 960
TGGTGCTACT AGCCCTTGGC AGGTGGACTG GCACTTACAT CTATGACCAC CTCACCCCTA 1020
TGTCGGATTG GGCTGCTAAT GGCCTGCGGG ACTTGGCGGT CGCCGTGGAG CCTATCATCT 1080
TCAGTCCGAT GGAGAAAAAA GTCATCGTCT GGGGAGCGGA GACAGCTGCT TGCGGGGATA 1140
TCTTACACGG ACTTCCCGTG TCCGCCCGAC TTGGCCGGGA GGTCCTCCTT GGCCCAGCTG 1200
ATGGCTATAC CTCCAAGGGG TGGAGTCTTC TCGCCCCCAT CACTGCTTAT GCCCAGCAGA 1260
CACGCGGCCT TTTGGGCACC ATAGTGGTGA GCATGACGGG GCGCGACAAG ACAGAACAGG 1320
CCGGGGAGAT TCAGGTCCTG TCCACGGTCA CTCAGTCCTT CCTCGGAACA ACCATCTCGG 1380
GGGTCTTATG GACTGTCTAC CATGGAGCTG GCAACAAGAC TCTAGCCGGC TCACGGGGTC 1440
CGGTCACACA GATGTACTCC AGTGCTGAGG GGGACTTAGT GGGGTGGCCC AGCCCCCCCG 1500
GGACCAAATC TTTGGAGCCG TGCACGTGTG GAGCGGTCGA CCTATACCTG GTCACGCGAA 1560
ACGCTGATGT CATCCCGGCT CGAAGACGCG GGGACAAGCG AGGAGCGCTA CTCTCCCCGA 1620
GACCTCTTTC CACCTTGAAG GGGTCCTCGG GGGGCCCGGT GCTCTGCCCC AGAGGCCACG 1680
CTGTCGGGGT CTTCCGGGCA GCCGTGTGCT CCCGGGGCGT GGCCAAGTCC ATAGATTTTA 1740
TCCCCGTTGA GACACTTGAC ATCGTCACTC GGTCCCCCAC CTTTAGTGAC AACAGCACAC 1800
CACCTGCTGT GCCCCAAACT TATCAGGTCG GGTACTTACA TGCCCCGACT GGTAGTGGAA 1860
AGAGCACCAA AGTCCCTGTC GCGTATGCCG CTCAGGGGTA CAAAGTGCTA GTGCTTAATC 1920
CCTCGGTGGC TGCCACCCTG GGGTTTGGGG CGTACTTGTC CAAGGCACAT GGCATCAATC 1980
CCAACATTAG GACTGGGGTC AGGACTGTGA CGACCGGGGC GCCCATCACG TACTCCACAT 2040
ATGGCAAATT CCTCGCCGAT GGGGGCTGCG CAGGCGGCGC CTATGACATC ATCATATGCG 2100
ATGAATGCCA TGCCGTGGAC TCTACCACCA TTCTCGGCAT CGGAACAGTC CTCGATCAAG 2160
CAGAGACAGC CGGGGTCAGG CTAACTGTAC TGGCTACGGC TACGCCCCCC GGGTCAGTGA 2220
CAACCCCCCA CCCCAACATA GAGGAGGTGG CCCTCGGGCA GGAGGGTGAG ATCCCCTTCT 2280
ATGGGAGGGC GATTCCCCTG TCATACATCA AGGGAGGAAG ACACTTGATC TTCTGCCACT 2340
CAAAGAAAAA GTGTGACGAG CTCGCGGCGG CCCTTCGGGG TATGGGCTTG AACGCAGTGG 2400
CATACTACAG AGGGCTGGAC GTCTCCGTAA TACCAACTCA GGGAGACGTA GTGGTCGTCG 2460
CCACCGACGC CCTCATGACG GGGTTTACTG GAGACTTTGA CTCCGTGATC GACTGCAACG 2520
TAGCGGTCAC TCAAGTTGTA GACTTCAGCT TGGACCCCAC ATTCACCATA ACCACACAGA 2580
CTGTCCCTCA AGACGCTGTC TCACGTAGCC AGCGCCGGGG CCGCACGGGC AGGGGAAGAC 2640
TGGGTATTTA TAGGTATGTT TCCACTGGTG AGCGAGCCTC AGGAATGTTT GACAGTGTAG 2700
TGCTCTGCGA GTGCTACGAT GCAGGGGCCG CATGGTATGA GCTCACACCA GCGGAGACCA 2760
CCGTCAGGCT CAGAGCATAT TTCAACACAC CTGGTTTGCC TGTGTGCCAA GACCATCTTG 2820
AGTTTTGGGA GGCAGTTTTC ACCGGCCTCA CACACATAGA TGCCCACTTC CTTTCCCAAA 2880
CAAAGCAATC GGGGGAAAAT TTCGCATACT TAACAGCCTA CCAGGCTACA GTGTGCGCTA 2940
GGGCCAAAGC CCCCCCCCCG TCCTGGGACG TCATGTGGAA GTGTTTGACT CGACTCAAGC 3000 
CCACACTCGT GGGCCCCACA CCTCTCCTGT ACCGCTTGGG CTCTGTTACC AACGAGGTCA 3060 
CCCTCACGCA TCCTGTGACG AAATACATCG CCACCTGCAT GCAAGCCGAC CTTGAGGTCA 3120 
TGACCAGCAC GTGGGTCTTA GCTGGGGGGG TCTTGGCGGC CGTCGCCGCG TACTGCCTGG 3180 
CGACCGGGTG TGTTTGCATC ATCGGCCGCT TGCACGTTAA CCAGCGAGCC GTCGTTGCAC 3240 
CGGACAAGGA GGTCCTCTAT GAGGCTTTTG ATGAGATGGA GGAATGTGCC TCTAGAGCGG 3300 
CTCTCATTGA AGAGGGGCAG CGGATAGCCG AGATGCTGAA GTCCAAGATC CAAGGCTTAT 3360 
TGCAGCAAGC TTCCAAACAA GCTCAAGACA TACAACCCGC TGTGCAGGCT TCTTGGCCCA 3420 
AGGTAGAGCA ATTCTGGGCC AAACACATGT GGAACTTCAT CAGCGGCATT CAATACCTCG 3480 
CAGGACTATC AACACTGCCA GGGAACCCTG CTGTAGCTTC CATGATGGCA TTCAGTGCCG 3540 
CCCTCACCAG TCCGTTGTCA ACTAGCACCA CTATCCTTCT CAACATTTTG GGGGGCTGGC 3600 
TAGCATCCCA AATTGCGCCT CCCGCGGGGG CTACCGGCTT CGTCGTCAGT GGCCTGGTGG 3660 
GGGCTGCCGT AGGCAGCATA GGCTTGGGTA AGGTGCTGGT GGACATCCTG GCAGGGTATG 3720 
GTGCGGGCAT TTCGGGGGCT CTCGTCGCAT TCAAGATCAT GTCTGGCGAG AAGCCCTCCA 3780 
TGGAGGATGT TGTCAACCTG CTGCCTGGAA TTCTGTCTCC GGGTGCCCTG GTGGTGGGAG 3840 
TCATCTGCGC GGCCATCCTG CGCCGACACG TGGGACCGGG GGAAGGCGCT GTCCAATGGA 3900 
TGAATAGGCT CATTGCCTTT GCTTCCAGAG GAAACCACGT CGCCCCCACC CACTACGTGA 3960 
CGGAGTCGGA 3970 
Sequence ID No.4 
Sequence Length: 2,693 
Sequence Type: nucleic acid 
Strandedness: single 
Topology: linear 

Molecule Type: cDNA to genomic RNA 
Method for Determination of Feature: E 



ATTCTGTCTC CGGGTGCCCT GGTGGTGGGA GTCATCTGCG CGGCCATCCT GCGCCGACAC 60 

GTGGGACCGG GGGAAGGCGC TGTCCAATGG ATGAATAGGC TCATTGCCTT TGCTTCCAGA 120 

GGAAACCACG TCGCCCCCAC CCACTACGTG ACGGAGTCGG ATGCGTCGCA GCGTGTGACC 180 

CAACTACTTG GCTCCCTTAC CATAACCAGC CTGCTCAGGA GACTCCACAA CTGGATTACT 240 

GAAGACTGCC CCATCCCATG CAGCGGCTCG TGGCTCCGCG ATGTGTGGGA TTGGGTTTGC 300 

ACCATCCTAA CAGACTTTAA AAACTGGCTG ACCTCCAAAT TGTTCCCAAA GATGCCTGGT 360 

CTCCCCTTTA TCTCTTGTCA AAAGGGGTAC AAGGGCGTGT GGGCTGGCAC TGGTATCATG 420 

ACCACACGGT GTCCTTGCGG CGCCAATATC TCTGGCAATG TCCGCCTGGG CTCCATGAGA 480 

ATTACGGGGC CCAAAACCTG CATGAATATC TGGCAGGGGA CCTTTCCCAT CAATTGTTAC 540 

ACGGAGGGCC AGTGCGTGCC GAAACCCGCA CCAAACTTTA AGATCGCCAT CTGGAGGGTG 600 

GCGGCCTCAG AGTACGCGGA GGTGACGCAG CACGGGTCAT ACCACTACAT AACAGGACTT 660 

ACCACTGATA ACTTGAAAGT TCCTTGCCAA CTACCTTCTC CAGAGTTCTT TTCCTGGGTG 720 

GACGGAGTGC AGATCCATAG GTTTGCCCCC ATACCGAAGC CGTTTTTTCG GGATGAGGTC 780 

TCGTTCTGCG TTGGGCTTAA TTCATTTGTC GTCGGGTCTC AGCTCCCTTG CGATCCTGAA 840 

CCTGACACAG ACGTATTGAC GTCCATGCTA ACAGACCCAT CCCATATCAC GGCGGAGACT 900 

GCAGCGCGGC GTTTGGCACG GGGGTCACCC CCGTCCGAGG CAAGCTCCTC AGCGAGCCAG 960 

CTATCGGCAC CATCGCTGCG AGCCACCTGC ACCACCCACG GCAAGGCCTA TGATGTGGAC 1020 

ATGGTGGATG CCAACCTGTT CATGGGGGGC GATGTGACCC GGATAGAGTC TGAGTCCAAA 1080 

GTGGTCGTTC TGGACTCTCT CGACCCAATG GTCGAAGAAA GGAGCGACCT TGAGCCTTCG 1140 

ATACCATCGG AATATATGCT CCCCAAGAAG AGATTCCCAC CAGCCTTACC GGCTTGGGCA 1200 
CGGCCTGATT ACAACCCACC GCTTGTGGAA TCGTGGAAGA GGCCAGATTA CCAACCGGCC 1260 
ACTGTTGCGG GCTGCGCTCT CCCCCCCCCT AAGAAAACCC CGACGCCTCC CCCAAGGAGA 1320 
CGCCGGACAG TGGGTCTGAG TGAGAGCTCC ATAGCAGATG CCCTACAACA GCTGGCCATC 1380 
AAGTCCTTTG GCCAGCCCCC CCCAAGCGGC GATTCAGGCC TTTCCACGGG GGCGGACGCA 1440 
GCCGATTCCG GCAGTCGGAC GCCCCCCGAT GAGTTGGCCC TTTCGGAGAC AGGTTCCATC 1500 
TCCTCCATGC CCCCTCTCGA GGGGGAGCCT GGAGATCCAG ACTTGGAGCC TGAGCAGGTA 1560 
GAGCTTCAAC CTCCCCCCCA GGGGGGGGTG GTAACCCCCG GCTCAGGCTC GGGGTCTTGG 1620 
TCTACTTGCT CCGAGGAGGA CGACTCCGTC GTGTGCTGCT CCATGTCATA CTCCTGGACC 1680 
GGGGCTCTAA TAACTCCTTG TAGCCCCGAA GAGGAAAAGT TGCCAATTGG CCCCTTGAGC 1740 
AACTCCCTGT TGCGATATCA CAACAAGGTG TACTGTACCA CATCAAAGAG CGCCTCATTA 1800 
AGGGCTAAAA AGGTAACTTT TGATAGGATG CAAGCGCTCG ACGCTCATTA TGACTCAGTC 1860 
TTGAAGGACA TTAAGCTAGC GGCCTCCAAG GTCACCGCAA GGCTTCTCAC TTTAGAGGAG 1920 
GCCTGCCAGT TAACTCCACC CCACTCTGCA AGATCCAAGT ATGGGTTTGG GGCTAAGGAG 1980 
GTCCGCAGCT TGTCCGGGAG AGCCGTTAAC CACATCAAGT CCGTGTGGAA GGACCTCCTG 2040 
GAAGACACAC AAACACCAAT TCCTACAACC ATCATGGCCA AAAATGAGGT GTTCTGCGTG 2100 
GACCCCACCA AGGGGGGTAA GAAAGCAGCT CGCCTTATCG TTTACCCTGA CCTCGGCGTC 2160 
AGGGTCTGCG AGAAAATGGC CCTTTATGAT ATCACACAAA AGCTTCCTCA GGCGGTGATG 2220 
GGGGCTTCTT ATGGATTCCA GTACTCCCCC GCTCAGCGGG TGGAGTTTCT CTTGAAGGCA 2280 
TGGGCGGAAA AGAAAGACCC TATGGGTTTT TCGTATGATA CCCGATGCTT TGACTCAACC 2340 
GTCACTGAGA GAGACATCAG GACTGAGGAG TCCATATATC GGGCTTGTTC CTTGCCCGAG 2400 
GAGGCCCACA CTGCCATACA CTCACTGACT GAGAGACTTT ACGTGGGAGG GCCCATGTTC 2460 
AACAGCAAGG GCCAGAGCTG CGGGTACAGG CGTTGCCGCG CCAGCGGGGT GCTTACCACT 2520 
AGCATGGGGA ACACCATCAC ATGCTATGTG AAAGCCTTAG CGGCCTGTAA GGCTGCAGGG 2580 
ATAATTGCGC CCACAATGCT GGTATGCGGC GATGACTTGG TTGTCATCTC AGAGAGCCAG 2640 
GGGACCGAGG AGGACGAGCG GAACCTGAGA GCCTTCACGG AGGCTATGAC CAG 2693 
Sequence ID No.5 
Sequence Length: 3,033 
Sequence Type: amino acid 
Topology: linear 
Molecule Type: protein 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr 

5 10 15 

Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile 

20 25 30 

Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly 

35 40 45 

Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly 

50 55 60 

Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg Ser Thr Gly Lys Ser 

65 70 75 

Trp Gly Lys Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly 

80 85 90 

Leu Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro 

95 100 105 

Ser Trp Gly Pro Asn Asp Pro Arg His Arg Ser Arg Asn Val Gly 

110 115 120 

Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly 

125 130 135 

Tyr Ile Pro Val Val Gly Ala Pro Leu Gly Gly Val Ala Arg Ala 

140 145 150 

Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Phe Ala 

155 160 165 
Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala 
170 175 180 
Leu Leu Ser Cys Ile Thr Thr Pro Val Ser Ala Ala Glu Val Lys 
185 190 195 

Asn Ile Ser Thr Gly Tyr Met Val Thr Asn Asp Cys Thr Asn Asp 
200 205 210 

Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu His Val Pro Gly 
215 220 225 

Cys Val Pro Cys Glu Lys Val Gly Asn Thr Ser Arg Cys Trp Ile 
230 235 240 

Pro Val Ser Pro Asn Val Ala Val Gln Gln Pro Gly Ala Leu Thr 
245 250 255 

Gln Gly Leu Arg Thr His Ile Asp Met Val Val Met Ser Ala Thr 
260 265 270 

Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Val Met 
275 280 285 

Leu Ala Ala Gln Met Phe Ile Val Ser Pro Gln His His Trp Phe 
290 295 300 

Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly Thr Ile Thr Gly 
305 310 315 

His Arg Met Ala Trp Asp Met Met Met Asn Trp Ser Pro Thr Ala 
320 325 330 

Thr Met Ile Leu Ala Tyr Ala Met Arg Val Pro Glu Val Ile Ile 
335 340 345 

Asp Ile Ile Gly Gly Ala His Trp Gly Val Met Phe Gly Leu Ala 
350 355 360 

Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val Val Val Ile Leu 
365 370 375 

Leu Leu Ala Ala Gly Val Asp Ala Gln Thr His Thr Val Gly Gly 






EP 0 532 167 A2 



380 385 390 

Ser Thr Ala His Asn Ala Arg Thr Leu Thr Gly Met Phe Ser Leu 
395 400 405 

Gly Ala Arg Gln Lys Ile Gln Leu Ile Asn Thr Asn Gly Ser Trp 
410 415 420 

His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu His Thr 
425 430 435 

Gly Phe Leu Ala Ser Leu Phe Tyr Thr His Ser Phe Asn Ser Ser 
440 445 450 

Gly Cys Pro Glu Arg Met Ser Ala Cys Arg Ser Ile Glu Ala Phe 
455 460 465 

Arg Val Gly Trp Gly Ala Leu Gln Tyr Glu Asp Asn Val Thr Asn 
470 475 480 

Pro Glu Asp Met Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Gln 
485 490 495 

Cys Gly Val Val Ser Ala Ser Ser Val Cys Gly Pro Val Tyr Cys 
500 505 510 

Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Leu Gly 
515 520 525 

Ala Pro Thr Tyr Thr Trp Gly Glu Asn Glu Thr Asp Val Phe Leu 
530 535 540 

Leu Asn Ser Thr Arg Pro Pro Gln Gly Ser Trp Phe Gly Cys Thr 
545 550 555 

Trp Met Asn Ser Thr Gly Tyr Thr Lys Thr Cys Gly Ala Pro Pro 
560 565 570 

Cys Arg Ile Arg Ala Asp Phe Asn Ala Ser Met Asp Leu Leu Cys 
575 580 585 

Pro Thr Asp Cys Phe Arg Lys His Pro Asp Thr Thr Tyr Ile Lys 
590 595 600 
Cys Gly Ser Gly Pro Trp Leu Thr Pro Arg Cys Leu Ile Asp Tyr 

605 610 615 

Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn Tyr Thr Ile 

620 625 630 

Phe Lys Ile Arg Met Tyr Val Gly Gly Val Glu His Arg Leu Thr 

635 640 645 

Ala Ala Cys Asn Phe Thr Arg Gly Asp Arg Cys Asn Leu Glu Asp 

650 655 660 

Arg Asp Arg Ser Gln Leu Ser Pro Leu Leu His Ser Thr Thr Glu 

665 670 675 

Trp Ala Ile Leu Pro Cys Thr Tyr Ser Asp Leu Pro Ala Leu Ser 

680 685 690 

Thr Gly Leu Leu His Leu His Gln Asn Ile Val Asp Val Gln Phe 

695 700 705 

Met Tyr Gly Leu Ser Pro Ala Leu Thr Lys Tyr Ile Val Arg Trp 

710 715 720 

Glu Trp Val Val Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val 

725 730 735 

Cys Ala Cys Leu Trp Met Leu Ile Leu Leu Gly Gln Ala Glu Ala 

740 745 750 

Ala Leu Glu Lys Leu Val Val Leu His Ala Ala Ser Ala Ala Ser 

755 760 765 

Cys Asn Gly Phe Leu Tyr Phe Val Ile Phe Phe Val Ala Ala Trp 

770 775 780 

Tyr Ile Lys Gly Arg Val Val Pro Leu Ala Thr Tyr Ser Leu Thr 

785 790 795 

Gly Leu Trp Ser Phe Gly Leu Leu Leu Leu Ala Leu Pro Gln Gln 

800 805 810 

Ala Tyr Ala Tyr Asp Ala Ser Val His Gly Gln Ile Gly Ala Ala 
815 820 825 

Leu Leu Val Leu Ile Thr Leu Phe Thr Leu Thr Pro Gly Tyr Lys 

830 835 840 

Thr Leu Leu Ser Arg Phe Leu Trp Trp Leu Cys Tyr Leu Leu Thr 

845 850 855 

Leu Ala Glu Ala Met Val Gln Glu Trp Ala Pro Pro Met Gln Val 

860 865 870 

Arg Gly Gly Arg Asp Gly Ile Ile Trp Ala Val Ala Ile Phe Cys 

875 880 885 

Pro Gly Val Val Phe Asp Ile Thr Lys Trp Leu Leu Ala Val Leu 

890 895 900 

Gly Pro Ala Tyr Leu Leu Lys Gly Ala Leu Thr Arg Val Pro Tyr 

905 910 915 

Phe Val Arg Ala His Ala Leu Leu Arg Met Cys Thr Met Val Arg 

920 925 930 

His Leu Ala Gly Gly Arg Tyr Val Gln Met Val Leu Leu Ala Leu 

935 940 945 

Gly Arg Trp Thr Gly Thr Tyr Ile Tyr Asp His Leu Thr Pro Met 

950 955 960 

Ser Asp Trp Ala Ala Asn Gly Leu Arg Asp Leu Ala Val Ala Val 

965 970 975 

Glu Pro Ile Ile Phe Ser Pro Met Glu Lys Lys Val Ile Val Trp 

980 985 990 

Gly Ala Glu Thr Ala Ala Cys Gly Asp Ile Leu His Gly Leu Pro 

995 1000 1005 

Val Ser Ala Arg Leu Gly Arg Glu Val Leu Leu Gly Pro Ala Asp 

1010 1015 1020 

Gly Tyr Thr Ser Lys Gly Trp Ser Leu Leu Ala Pro Ile Thr Ala 

1025 1030 1035 
Tyr Ala Gln Gln Thr Arg Gly Leu Leu Gly Thr Ile Val Val Ser 
1040 1045 1050 

Met Thr Gly Arg Asp Lys Thr Glu Gln Ala Gly Glu Ile Glu Val 
1055 1060 1065 

Leu Ser Thr Val Thr Gln Ser Phe Leu Gly Thr Thr Ile Ser Gly 
1070 1075 1080 

Val Leu Trp Thr Val Tyr His Gly Ala Gly Asn Lys Thr Leu Ala 
1085 1090 1095 

Gly Ser Arg Gly Pro Val Thr Gln Met Tyr Ser Ser Ala Glu Gly 
1100 1105 1110 

Asp Leu Val Gly Trp Pro Ser Pro Pro Gly Thr Lys Ser Leu Glu 
1115 1120 1125 

Pro Cys Thr Cys Gly Ala Val Asp Leu Tyr Leu Val Thr Arg Asn 
1130 1135 1140 

Ala Asp Val Ile Pro Ala Arg Arg Arg Gly Asp Lys Arg Gly Ala 
1145 1150 1155 

Leu Leu Ser Pro Arg Pro Leu Ser Thr Leu Lys Gly Ser Ser Gly 
1160 1165 1170 

Gly Pro Val Leu Cys Pro Arg Gly His Ala Val Gly Val Phe Arg 
1175 1180 1185 

Ala Ala Val Cys Ser Arg Gly Val Ala Lys Ser Ile Asp Phe Ile 
1190 1195 1200 

Pro Val Glu Thr Leu Asp Ile Val Thr Arg Ser Pro Thr Phe Ser 
1205 1210 1215 

Asp Asn Ser Thr Pro Pro Ala Val Pro Gln Thr Tyr Gln Val Gln 
1220 1225 1230 

Tyr Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro 
1235 1240 1245 

Val Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro 
1250 1255 1260 

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Leu Ser Lys Ala 

1265 1270 1275 

His Gly Ile Asn Pro Asn Ile Arg Thr Gly Val Arg Thr Val Thr 

1280 1285 1290 

Thr Gly Ala Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala 

1295 1300 1305 

Asp Gly Gly Cys Ala Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp 

1310 1315 1320 

Glu Cys His Ala Val Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr 

1325 1330 1335 

Val Leu Asp Gln Ala Glu Thr Ala Gly Val Arg Leu Thr Val Leu 

1340 1345 1350 

Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Thr Pro His Pro Asn 

1355 1360 1365 

Ile Glu Glu Val Ala Leu Gly Gln Glu Gly Glu Ile Pro Phe Tyr 

1370 1375 1380 

Gly Arg Ala Ile Pro Leu Ser Tyr Ile Lys Gly Gly Arg His Leu 

1385 1390 1395 

Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Ala 

1400 1405 1410 

Leu Arg Gly Met Gly Leu Asn Ala Val Ala Tyr Tyr Arg Gly Leu 

1415 1420 1425 

Asp Val Ser Val Ile Pro Thr Gln Gly Asp Val Val Val Val Ala 

1430 1435 1440 

Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val 

1445 1450 1455 

Ile Asp Cys Asn Val Ala Val Thr Gln Val Val Asp Phe Ser Leu 

1460 1465 1470 
Asp Pro Thr Phe Thr Ile Thr Thr Gln Thr Val Pro Gln Asp Ala 

1475 1480 1485 

Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Leu 

1490 1495 1500 

Gly Ile Tyr Arg Tyr Val Ser Thr Gly Glu Arg Ala Ser Gly Met 

1505 1510 1515 

Phe Asp Ser Val Val Leu Cys Glu Cys Tyr Asp Ala Gly Ala Ala 

1520 1525 1530 

Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala 

1535 1540 1545 

Tyr Phe Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu 

1550 1555 1560 

Phe Trp Glu Ala Val Phe Thr Gly Leu Thr His Ile Asp Ala His 

1565 1570 1575 

Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn Phe Ala Tyr Leu 

1580 1585 1590 

Thr Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys Ala Pro Pro 

1595 1600 1605 

Pro Ser Trp Asp Val Met Trp Lys Cys Leu Thr Arg Leu Lys Pro 

1610 1615 1620 

Trp Leu Val Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ser Val 

1625 1630 1635 

Thr Asn Glu Val Thr Leu Thr His Pro Val Thr Lys Tyr Ile Ala 

1640 1645 1650 

Thr Cys Met Gln Ala Asp Leu Glu Val Met Thr Ser Thr Trp Val 

1655 1660 1665 

Leu Ala Gly Gly Val Leu Ala Ala Val Ala Ala Tyr Cys Leu Ala 

1670 1675 1680 

Thr Gly Cys Val Cys Ile Ile Gly Arg Leu His Val Asn Gln Arg 
1685 1690 1695 

Ala Val Val Ala Pro Asp Lys Glu Val Leu Tyr Glu Ala Phe Asp 

1700 1705 1710 

Glu Met Glu Glu Cys Ala Ser Arg Ala Ala Leu Ile Glu Glu Gly 

1715 1720 1725 

Gln Arg Ile Ala Glu Met Leu Lys Ser Lys Ile Gln Gly Leu Leu 

1730 1735 1740 

Gln Gln Ala Ser Lys Gln Ala Gln Asp Ile Gln Pro Ala Val Gln 

1745 1750 1755 

Ala Ser Trp Pro Lys Val Glu Gln Phe Trp Ala Lys His Met Trp 

1760 1765 1770 

Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu 

1775 1780 1785 

Pro Gly Asn Pro Ala Val Ala Ser Met Met Ala Phe Ser Ala Ala 

1790 1795 1800 

Leu Thr Ser Pro Leu Ser Thr Ser Thr Thr Ile Leu Leu Asn Ile 

1805 1810 1815 

Leu Gly Gly Trp Leu Ala Ser Gln Ile Ala Pro Pro Ala Gly Ala 

1820 1825 1830 

Thr Gly Phe Val Val Ser Gly Leu Val Gly Ala Ala Val Gly Ser 

1835 1840 1845 

Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly 

1850 1855 1860 

Ala Gly Ile Ser Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly 

1865 1870 1875 

Glu Lys Pro Ser Met Glu Asp Val Val Asn Leu Leu Pro Gly Ile 

1880 1885 1890 

Leu Ser Pro Gly Ala Leu Val Val Gly Val Ile Cys Ala Ala Ile 

1895 1900 1905 
Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met 

1910 1915 1920 

Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ala Pro 

1925 1930 1935 

Thr His Tyr Val Thr Glu Ser Asp Ala Ser Gln Arg Val Thr Gln 

1940 1945 1950 

Leu Leu Gly Ser Leu Thr Ile Thr Ser Leu Leu Arg Arg Leu His 

1955 1960 1965 

Asn Trp Ile Thr Glu Asp Cys Pro Ile Pro Cys Ser Gly Ser Trp 

1970 1975 1980 

Leu Arg Asp Val Trp Asp Trp Val Cys Thr Ile Leu Thr Asp Phe 

1985 1990 1995 

Lys Asn Trp Leu Thr Ser Lys Leu Phe Pro Lys Met Pro Gly Leu 

2000 2005 2010 

Pro Phe Ile Ser Cys Gln Lys Gly Tyr Lys Gly Val Trp Ala Gly 

2015 2020 2025 

Thr Gly Ile Met Thr Thr Arg Cys Pro Cys Gly Ala Asn Ile Ser 

2030 2035 2040 

Gly Asn Val Arg Leu Gly Ser Met Arg Ile Thr Gly Pro Lys Thr 

2045 2050 2055 

Cys Met Asn Ile Trp Gln Gly Thr Phe Pro Ile Asn Cys Tyr Thr 

2060 2065 2070 

Glu Gly Gln Cys Val Pro Lys Pro Ala Pro Asn Phe Lys Ile Ala 

2075 2080 2085 

Ile Trp Arg Val Ala Ala Ser Glu Tyr Ala Glu Val Thr Gln His 

2090 2095 2100 

Gly Ser Tyr His Tyr Ile Thr Gly Leu Thr Thr Asp Asn Leu Lys 

2105 2110 2115 

Val Pro Cys Gln Leu Pro Ser Pro Glu Phe Phe Ser Trp Val Asp 
2120 2125 2130 

Gly Val Gln Ile His Arg Phe Ala Pro Ile Pro Lys Pro Phe Phe 
2135 2140 2145 

Arg Asp Glu Val Ser Phe Cys Val Gly Leu Asn Ser Phe Val Val 
2150 2155 2160 

Gly Ser Gln Leu Pro Cys Asp Pro Glu Pro Asp Thr Asp Val Leu 
2165 2170 2175 

Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Thr Ala 
2180 2185 2190 

Ala Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Glu Ala Ser Ser 
2195 2200 2205 

Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Arg Ala Thr Cys Thr 
2210 2215 2220 

Thr His Gly Lys Ala Tyr Asp Val Asp Met Val Asp Ala Asn Leu 
2225 2230 2235 

Phe Met Gly Gly Asp Val Thr Arg Ile Glu Ser Glu Ser Lys Val 
2240 2245 2250 

Val Val Leu Asp Ser Leu Asp Pro Met Val Glu Glu Arg Ser Asp 
2255 2260 2265 

Leu Glu Pro Ser Ile Pro Ser Glu Tyr Met Leu Pro Lys Lys Arg 
2270 2275 2280 

Phe Pro Pro Ala Leu Pro Ala Trp Ala Arg Pro Asp Tyr Asn Pro 
2285 2290 2295 

Pro Leu Val Glu Ser Trp Lys Arg Pro Asp Tyr Gln Pro Ala Thr 
2300 2305 2310 

Val Ala Gly Cys Ala Leu Pro Pro Pro Lys Lys Thr Pro Thr Pro 
2315 2320 2325 

Pro Pro Arg Arg Arg Arg Thr Val Gly Leu Ser Glu Ser Ser Ile 
2330 2335 2340 
Ala Asp Ala Leu Gln Gln Leu Ala Ile Lys Ser Phe Gly Gln Pro 

2345 2350 2355 

Pro Pro Ser Gly Asp Ser Gly Leu Ser Thr Gly Ala Asp Ala Ala 

2360 2365 2370 

Asp Ser Gly Ser Arg Thr Pro Pro Asp Glu Leu Ala Leu Ser Glu 

2375 2380 2385 

Thr Gly Ser Ile Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly 

2390 2395 2400 

Asp Pro Asp Leu Glu Pro Glu Gln Val Glu Leu Gln Pro Pro Pro 

2405 2410 2415 

Gln Gly Gly Val Val Thr Pro Gly Ser Gly Ser Gly Ser Trp Ser 

2420 2425 2430 

Thr Cys Ser Glu Glu Asp Asp Ser Val Val Cys Cys Ser Met Ser 

2435 2440 2445 

Tyr Ser Trp Thr Gly Ala Leu Ile Thr Pro Cys Ser Pro Glu Glu 

2450 2455 2460 

Glu Lys Leu Pro Ile Asn Pro Leu Ser Asn Ser Leu Leu Arg Tyr 

2465 2470 2475 

His Asn Lys Val Tyr Cys Thr Thr Ser Lys Ser Ala Ser Leu Arg 

2480 2485 2490 

Ala Lys Lys Val Thr Phe Asp Arg Met Gln Ala Leu Asp Ala His 

2495 2500 2505 

Tyr Asp Ser Val Leu Lys Asp Ile Lys Leu Ala Ala Ser Lys Val 

2510 2515 2520 

Thr Ala Arg Leu Leu Thr Leu Glu Glu Ala Cys Gln Leu Thr Pro 

2525 2530 2535 

Pro His Ser Ala Arg Ser Lys Tyr Gly Phe Gly Ala Lys Glu Val 

2540 2545 2550 
Arg Ser Leu Ser Gly Arg Ala Val Asn His Ile Lys Ser Val Trp 

2555 2560 2565 

Lys Asp Leu Leu Glu Asp Thr Gln Thr Pro Ile Pro Thr Thr Ile 

2570 2575 2580 

Met Ala Lys Asn Glu Val Phe Cys Val Asp Pro Thr Lys Gly Gly 

2585 2590 2595 

Lys Lys Ala Ala Arg Leu Ile Val Tyr Pro Asp Leu Gly Val Arg 

2600 2605 2610 

Val Cys Glu Lys Met Ala Leu Tyr Asp Ile Thr Gln Lys Leu Pro 

2615 2620 2625 

Gln Ala Val Met Gly Ala Ser Tyr Gly Phe Gln Tyr Ser Pro Ala 

2630 2635 2640 

Gln Arg Val Glu Phe Leu Leu Lys Ala Trp Ala Glu Lys Lys Asp 

2645 2650 2655 

Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val 

2660 2665 2670 

Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr Arg Ala Cys 

2675 2680 2685 

Ser Leu Pro Glu Glu Ala His Thr Ala Ile His Ser Leu Thr Glu 

2690 2695 2700 

Arg Leu Tyr Val Gly Gly Pro Met Phe Asn Ser Lys Gly Gln Thr 

2705 2710 2715 

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser 

2720 2725 2730 

Met Gly Asn Thr Ile Thr Cys Tyr Val Lys Ala Leu Ala Ala Cys 

2735 2740 2745 

Lys Ala Ala Gly Ile Ile Ala Pro Thr Met Leu Val Cys Gly Asp 

2750 2755 2760 
Asp Leu Val Val Ile Ser Glu Ser Gln Gly Thr Glu Glu Asp Glu 

2765 2770 2775 

Arg Asn Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala 

2780 2785 2790 

Pro Pro Gly Asp Pro Pro Arg Pro Glu Tyr Asp Leu Glu Leu Ile 

2795 2800 2805 

Thr Ser Cys Ser Ser Asn Val Ser Val Ala Leu Gly Pro Gln Gly 

2810 2815 2820 

Arg Arg Arg Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Ile Ala 

2825 2830 2835 

Arg Ala Ala Trp Glu Thr Val Arg His Ser Pro Val Asn Ser Trp 

2840 2845 2850 

Leu Gly Asn Ile Ile Gln Tyr Ala Pro Thr Ile Trp Ala Arg Met 

2855 2860 2865 

Val Leu Met Thr His Phe Phe Ser Ile Leu Met Ala Gln Asp Thr 

2870 2875 2880 

Leu Asp Gln Asn Leu Asn Phe Glu Met Tyr Gly Ala Val Tyr Ser 

2885 2890 2895 

Val Ser Pro Leu Asp Leu Pro Ala Ile Ile Glu Arg Leu His Gly 

2900 2905 2910 

Leu Asp Ala Phe Ser Leu His Thr Tyr Thr Pro His Glu Leu Thr 

2915 2920 2925 

Arg Val Ala Ser Ala Leu Arg Lys Leu Gly Ala Pro Pro Leu Arg 

2930 2935 2940 

Ala Trp Lys Ser Arg Ala Arg Ala Val Arg Ala Ser Leu Ile Ser 

2945 2950 2955 

Arg Gly Gly Arg Ala Ala Val Cys Gly Arg Tyr Leu Phe Asn Trp 

2960 2965 2970 

Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Leu Pro Glu Ala Arg 

2975 2980 2985 

Leu Leu Asp Leu Ser Ser Trp Phe Thr Val Gly Ala Gly Gly Gly 

2990 2995 3000 

Asp Ile Tyr His Ser Val Ser Arg Ala Arg Pro Arg Leu Leu Leu 

3005 3010 3015 

Leu Gly Leu Leu Leu Leu Phe Val Gly Val Gly Leu Phe Leu Leu 

3020 3025 3030 

Pro Ala Arg 
3033 
Sequence ID No.6 

Sequence Length: 9,511 

Sequence Type: nucleic acid 

Strandedness: single 

Topology: linear 

Molecule Type: genomic RNA 

Method for Determination of Feature: E 



GCCCGCCCCC UGAUGGGGGC GACACUCCGC CAUGAAUCAC UCCCCUGUGA GGAACUACUG 60 
UCUUCACGCA GAAAGCGUCU AGCCAUGGCG UUAGUAUGAG UGUCGUACAG CCUCCAGGCC 120 
CCCCCCUCCC GGGAGAGCCA UAGUGGUCUG CGGAACCGGU GAGUACACGG GAAUUACCGG 180 
AAAGACUGGG UCCUUUCUUG GAUAAACCCA CUCUAUGUCC GGUCAUUUGG GCACGCCCCC 240 
GCAAGACUGC UAGCCGAGUA GCGUUGGGUU GCGAAAGGCC UUGUGGUACU GCCUGAUAGG 300 
GURCUUGCGA GUGCCCCGGG AGGUCUCGUA GACCGUGCAU CAUGAGCACA AAUCCUAAAC 360 
CUCAAAGAAA AACCAAAAGA AACACAAACC GCCGCCCACA GGACGUUAAG UUCCCGGGUG 420 
GCGGUCAGAU CGUUGGCGGA GUUUACUUGC UGCCGCGCAG GGGCCCCAGG UUGGGUGUGC 480 
GCGCGACAAG GAAGACUUCY GAGCGAUCCC AGCCGCGUGG ACGACGCCAG CCCAUCCCGA 540 
AAGAUCGGCG CUCCACCGGC AAGUCCUGGG GAAAGCCAGG AUAUCCUUGG CCCCUGUACG 600 
GAAACGAGGG UUGCGGCUGG GCGGGUUGGC UCCUGUCCCC CCGCGGGUCU CGUCCUACUU 660 
GGGGCCCCAC CGACCCCCGG CAI1AGAUCAC GCAAUUUGGG CAGAGUCAUC GAUACCAUUA 720 
CGUGUGGUUU UGCCGACCUC AUGGGGUACA UCCCUGUCGU UGGCGCCCCG GUYGGAGGCG 780 
UCGCCAGAGC UCUGGCACAC GGDGUUAGGG UCCUGGAGGA CGGGAUAAAU UACGCAACAG 840 
GGAAUUUACC CGGUUGCUCU UUUUCUAUCU UUUUGCUUGC UCUUCUGUCA UGCGUCACAR 900 
UGCCAGUGUC UGCAGUGGAA GUCAGGAACA IIYAGUUCUAG CUACUACGCC ACUAAUGAUU 960 
GCUCAAACAA CAGCAUCACC UGGCAGCUCA CUGACGCAGU UCUCCAUCUU CCUGGAUGCG 1020 
UCCCAUGUGA GAAYGAUAAY GGCACCUUGC RUUGCUGGAU ACAAGUAACA CCCRACGUGG 1080 
CUGUGAAACA CCGCGGUGCG CUCACUCGUA GCCUGCGAAC ACACGUCGAC AUGAUCGUAA 1140 
UGGCAGCUAC GGCCUGCUCG GCCUUGUAUG UGGGAGAUGU GUGCGGGGCC GUGAUGAUYC 1200 
UAUCGCAGGC UUUCAUGGUA UCACCACAAC GCCACAACUU CACCCAAGAG UGCAACUGUU 1260 
CCAUCUACCA AGGUCACAUC ACCGGCCAUC GCAUGGCAUG GGACAUGAUG CURARCUGGU 1320 
CUCCAACUCU URCCAUGAUC CUCGCCUACG CYGCUCGYGU UCCCGARCUG GUCCUCGAAA 1380 
UYAUYUUCGG CGGCCAUUGG GGUGUGGYGU UYGGCUUGGS CUAUUUCUCC AllGCARGGAG 1440 
CGUGGGCCAA AGUCRUYGCC AUCCUCCUUC UUGUUGCGGG AGUGGAUGCA WCCACCUAUU 1500 
CCASCGGYCA GSAAGCGGGU CGURCCGYCK HKGGGWUCKC URGCCOUU AMUACUGGUG 1560 
CCAAGCAGAA CCUCYAUUUR AUCAACACCA AUGGCAGCUG GCACAUAAAC CGGACUGCCC 1620 
UCAAUUGCAA UGACAGCYUA SAGACGGGUU UCHUCGCUUC CYUGKUUUAC WHCCRCARGU 1680 
UCAACAGCUC UGGCUGCCCC GAGCGCUUGU CUUCCUGCCG CGGGCUGGAC GAYUUYCGCA 1740 
UCGGCUGGGG AACCUUGGAA UACGAAACCA ACGUCACCAA CGAUGRGGAC AUGAGGCCGU 1800 
ACUGCUGGCA UUACCCCCCG AGGCCUUGCG GCAUCGUCCC GGCUAGGACG GUUUGCGGAC 1860 
CGGUCUAUUG YUUCACCCCU AGCCCUGUUG UCGUGGGCAC CACUGACAAG CAGGGCGUAC 1920 
CCACCUACAC CUGGGGRGAA AACGAGACCG AUGUCUUCCU GCRAAAUAGC ACAAGACCCC 1980 
CGCGAGGAGC UUGGUUCGGC UGCACYUGGA UGAACGGGAC UGGGUUCACU AAGACAUGCG 2040 
GUGCACCACC UUGCCGCAUU AGGAAAGACU ACAACAGCAC UCUCGAUUUA UUGUGCCCCA 2100 
CAGACUGUUU UAGGAAGCAC CCAGAUGCUA CCUAUCUUAA GUGUGGAGCA GGGCCUUGGU 2160 
UAACUCCCAG GUGCCUGGUA GACUACCCUU AUAGRYUGUG GCAUUAUCCG UGCACUGUAA 2220 
ACUUCACCAU CUUYAAGGCG CGGAUGUAUG (JAGGAGGGGl) GGAGCAUCGA UUCUCCGCAG 2280 
CAUGCAACUU CACGCGCGGA GAUCGCUGCA GACUGGAAGA UAGGGAUAGG GGYCAGCAGA 2340 
GUCCACUGCU GCAUUCCACU ACUGAGUGGG CGGUGYUCCC AUGCUCCUUC UCUGACCUAC 2400 
CAGCACUAUC CACUGGCCUA UUGCACCUCC ACCAAAACAU CGUGGACGUG CAGUACCUYU 2460 
ACGGACUUUC UCCGGCUCUG ACAAGAUACA UCGUGAAGUG GGAGUGGGUG AUCCUCCUUU 2520 
UCUUGUUGUU GGCAGACGCC AGGRUCUGUG CAUGCCUUUG GAUGCUCAHC AUACUGGGCC 2580 
AAGCCGAAGC GGCGCUUGAG AAGCUCAUCA UCUUGCACUC CGCUAGYGCU GCUAGUGCCA 2640 
AUGGUCCGCU GUGGUUUUUC AUCUUCUUUA CAGCGGCCUG GUACUUAAAG GGCAGGGUGG 2700 
UCCCCGUGGC CACGUACUCU GUBCUCGGCU URUGGUCCUU CCUCCUCCUA GUCCUGGCYU 2760 
UACCACAGCA GGCUUAUGCC UUGGACGCUG CUGAACAAGG GGAACUGGGG CUGGCCAUAU 2820 
UAGUAAUUAU AUCCAUCUUU ACUCUUACCC CAGCAUACAA GAUCCUCCUG AGCCGUUCAG 2880 
UGUGGUGGCU GUCCUACAUG CUGGUCUUGG CCGAGGCCCA GAUUCAGCAA UGGGUUCCCC 2940 
CCCUGGAGGU CCGAGGGGGG CGUGACGGGA UCAUCUGGGU GGCUGUCAUU CUACACCCAC 3000 
GCCUUGUGUU UGAGGUCACG AAAUGGUUGU UAGCAAUCCU GGGGCCUGCC UACCUCCUUA 3060 
RAGCGUCUCU GCUACGGAUA CCGUACUUUG UGAGGGCCCA CGCUUUGCUA CGAGUGUGUA 3120 
CCCUGGUGAA ACACCUCGCR GGGGCUAGGU ACAUCCAGAU GCUGUURAUC ACCAUAGGCA 3180 
GAUGGACCGG CACUUACAUC UACGACCACC UCUCCCCUUU AUCAACUUGG GCGGCCCAGG 3240 
GUUURCGGGA CCUGGCAAUC GCCGUGGAGC CUGUGGUGUU CAGCCCAAUG GAGAAGAAGG 3300 
UCAUUGUGUG GGGGGCUGAG ACAGUGGCGU GUGGAGACAlf CCUGCAUGGC CUCCCGGUCU 3360 
CCGCGAGGCl) AGGUAGGGAR GUUCUGCUCG GCCCUGCCGA CGGCUACACC UCCAAGGGGU 3420 
GGAAKCUCCU AGCUCCCAUU ACUGCUUACA CUCAGCAAAC UCGUGGUCUC CUGGGUGCUA 3480 
UCGUGGUCAG CCUAACGGGC CGCGACAAAA AUGAGCAGGC UGGGCAGGUC CAGGUUCUGU 3540 
CCUCCGUCAC ACAAACUUUC UUGGGGACAU CCAUUUCGGG CGUCCUCUGG ACAGUAUAUC 3600 
ACGGGGCUGG UAAUAAGACC UUGGCCGGCC CCAAGGGACC AGUCACUCAG AUGUACACCA 3660 
GCGCAGAAGG GGACCUCGUG GGAUGGCCUA GUCCCCCCGG GACUAAGUCA UUGGACCCCU 3720 
GUACCUGCGG GGCCGUAGAC CUCUACCUGG UCACCCGAAA CGCUGAUGUC AUUCCGGUCC 3780 
GGAGGAAAGA UGACCGACGG GGUGCAUUAC UCUCGCCAAG GCCCCUCUCA ACCCUCAAAG 3840 
GAUCAUCCGG AGGGCCCGUG CUCUGCUCWA GGGGACACGC CGUGGGCUUG UUCAGAGCGG 3900 
CCGUGUGUGC CAGGGGUGUA GCCAAAUCUA UUGACUUCAU CCCCGUCGAA UCACUCGAUR 3960 
UCGCCACACG GACGCCCAGU UUCUCUGACA ACAGURCGCC GCCAGCUGUG CCCCAGUCUU 4020 
ACCAGGUGGG UUACUUGCAC GCACCAACAG GCAGCGGAAA GAGCACCAAG GUCCCUGCCG 4080 
CGUAUGCCAG UCAGGGGUAU AAAGUACUCG UACUAAAUCC CUCUGUCGCG GCCACACUUG 4140 
GUUUUGGGGC CUACAUGUCC AAAGCCCACG GGAUCAACCC UAAUAUCAGA ACUGGAGUGC 4200 
GGACCGUUAC CACCGGGGAC UCUAUCACUU ACUCCACUUA UGGCAAGUUU AUCGCAGAUG 4260 
GAGGCUGUGC AGCCGGUGCC UAUGACAUCA UCAUAUGCGA CGAAUGCCAU UCAGUGGACG 4320 
CJACUACCAU CCUUGGCAUU GGAACAGUCC UUGACCAAGC UGAGACCGCA GGCGUCAGGC 4380 
UAGUGGUYUU GGCCACAGCC ACGCCUCCCG GUACGGUGAC AACUCCCCAC AGUAACAUAG 4440 
AGGAGGUGGC CCUUGGUCAC GAGGGCGAGA UCCCUUUUUA UGGCAAAGCU AUUCCCCUAG 4500 
CUUUCAUCAA GGGGGGCAGA CACUUGAUCU UUUGCCAUUC AAAGAAGAAG UGCGACGAGC 4560 
UCGCAGCGGC CCUCCGGGGC AYGGGUGUCA AUGCCGUUGC AUACUAUAGG GGUCUCGACG 4620 
UCUCCGUUAU ACCAACUCAA GGAGACGUGG UGGUUGUCGC CACUGAUGCC CUAAUGACUG 4680 
GGUACACCGG CGACUUUGAC UCYGUCAUCG ACUGUAAUGU UGCAGUCUCU CAGAUUGUUG 4740 
ACUUCAGCCU AGACCCAACC UUCACCAUCA CCACUCAAAC CGUCCCUCAG GAGGCUGUCU 4800 
CCCGUAGUCA ACGUAGAGGG AGAACUGGGA GGGGGCGAUU GGGCRUUUAC AGGUAUGUUU 4860 
CGUCAGGYGA RRGGCCGUCU GGGAUGUUCG ACAGCGUAGU GCYCUGCGAG UGCUAUGAUG 4920 
CCGGGGCAGC CUGGUACGAG CUUACACCUG CUGAGACUAC GGUGAGACUC CGGGCYUAUU 4980 
UCAACACGCC CGGUUUGCCC GUAUGUCAAG ACCACCUGGA GUUCUGGGAA GCGGUCUUUA 5040 
CAGGUCUCAC WCACAUURAC GCCCACUUCC UCUCCCAGAC GAAGCAAGGA GGAGAAAACU 5100 
UUGCRUAUCU AACGGCCUAC CAGGCCACAG UAUGCGCCAG GGCAAAGGCC CCUCCUCCUU 5160 
CGUGGGACGU GAUGUGGAAG UGUCUAACUA GGCUCAAACC UACACUGACU GGUCCCACCC 5220 
CCCUCCUGUA CCGCUUGGGU GCCGUGACCA AUGAGGUYAC CUUGACGCAC CCCGUGACGA 5280 
AAUACAUCGC CACGUGCAUG CAAGCUGACC UYGAGAUCAU GACAAGCUCA UGGGUCCUGG 5340 
CGGGGGGGGU GCUAGCCGCC GUGGCAGCUU ACUGCCUGGC GACUGGCUGC AUUUCCAUCA 5400 
UUGGCCGCCU ACACCUGAAU GAUCGGGUGG UUGUGRCCCC YGACAAGGAR AUCUUAUAUG 5460 
AGGCCUUUGA UGAGAUGGAA GAAUGCGCCU CCAAAGCCGC CCliCAUUGAG GAAGGGCAGC 5520 
GGAUGGCGGA GAUGCUCAAA UCUAAGAUAC AAGGCCUCCU ACAACAGGCC ACAAGGCAAG 5580 
CUCAAGRCAU RCAGCCAGCU AUACAGUCAU CAUGGCCCAA GCUUGAACAA UUUUGGGCCA 5640 
AACACAUGUG GAACUUCAUC AGUGGUAUAC AGUACCUAGC AGGACUCUCC ACCCUACCGG 5700 
GAAAUCCUGC AGURGCAUCA AUGAUGGCUU UUAGCGCCGC GCUGACUAGC CCACUACCCA 5760 
CCAGCACCAC CAUCCUCUUG AACAUCAUGG GAGGAUGCUU GGCCUCYCAG AUUGCCCCCC 5820 
CUGCCGGAGC CACYGGCUUC GUUGUCAGUG GUCUAGUGGG GGCGGCCGUC GGAAGCAUAG 5880 
GCCUGGGUAA GAUACUGGUG GACGUUUUGG CCGGGUACGG CGCAGGCAUU UCAGGGGCCC 5940 
UCGUAGCUUU UAAGAUCAUG AGCGGCGAGA AGCCCACGGU AGAAGACGUU GUGAAUCUCC 6000 
UGCCUGCUAU YCUGUCUCCU GGUGCGYUGG UAGUGGGAGU CAUCUGUGCA GCAAUYCUGC 6060 
GCCGCCACGU CGGUCAGGGA GAGGGRGCGG UCCAGUGGAU GAACAGACUG AUCGCCUUCG 6120 
CCUCCAGGGG AAACCACGUU GCCCCUACCC ACUACGUGGU GGAGUCUGAC GCUUCACAGC 6180 
GUGURACGCA GGUGCUGAGU UCACUUACAA UUACCAGCUU ACUUAGGAGA CUACAUGCCU 6240 
GGAUCACUGA AGAUUGCCCA RUCCCAUGCU CGGGGUCUUG GCUCCAGGAC AUUUGGGAUU 6300 
GGGUUUGUUC CAUCCUCACA GACUUYAAAA ACUGGCUGUC UUCAAAAUUA CUCCCCAAGA 6360 
UGCCCGGCAU UCCCUUUAUC UCUUGCCAGA AGGGAUACAA GGGUGUAUGG GCUGGUACGG 6420 
GUGUCAUGAC YACUCGRURC CCAUGUGGAG CAAACAOC GGGCCAUGUC CGCAUGGGCA 6480 
CCAUGAAAAU AACAGGCCCG AAGACUUGCU UGAACCUGUG GCAGGGGACU UUCCCCAUUA 6540 
AUUGUUACAC AGAAGGGCCY UGCGUGCCAA AACCCCCUCC UAAUUACAAG ACCGCAAUUU 6600 
GGAGGGUGGC AGCGUCGGAG UACGUUGAGG UCACACAGCA UGGCUCUUUC UCGUAUGUAA 6660 
CRGGGUUAAC CAGUGACAAC CUUAAGGUYC CUUGCCAGGU ACCAGCUCCA GAAUUUUUCU 6720 
CUUGGGUGGA CGGGGUGCAA AUCCACCGAU UCGCCCCCGU WCCAGGUCCC UUCUUUCGGG 6780 
AUGAGGUAAC GUUCACCGUA GGCCUUAACU CCUUCGUGGU CGGCUCUCAG CUCCCUUGCG 6840 
AUCCUGAGCC GGACACCGAR GUACUGGCCU CYAUGUUGAC AGACCCGUCC CACAUCACCG 6900 
CKGAGGCGGC AGCCAGGCGA UUGGCAAGGG GAUCUCCCCC YUCACAGGCU AGCUCCUCAG 6960 
CGAGCCAGCU CUCUGCCCCG UCCUUGAAGG CUACCUGUAC CACCCAUAAG ACAGCAUAUG 7020 
AUUGUGACAU GGUGGAUGCY AACCUUUUCA UGGGAGGHGA UGUGAYCCGG AUUGAGUCUG 7080 
ACUCUAAGGU GAUCGUUCUA GACUCCCUCG AUUCCAUGAC UGAGGUAGAG GAUGAUCGUG 7140 
AGCCUUCUGU ACCAUCAGAG UACCUGAUCA AGAGGAGAAA GUUCCCACCG GCGCUGCCUC 7200 
CUUGGGCCCG UCCAGACUAC AAUCCUGUUU UGAUCGAGAC AUGGAAGAGG CCGGGCUAUG 7260 
AACCACCCAC UGUCCUAGGC UGUGCCCUCC CCCCCACACY UCAAACGCCA GUGCCUCCAC 7320 
CUCGGAGGCG CCGCGCYAAA RUCCUGACCC AGGACRAUGU GGAGGGGRUC CUCAGGGAGA 7380 
UGGCUGACAA AGURCUCAGC CCUCUCCAAG ACAACAAUGA CUCCGGUCAC UCCACUGGAG 7440 
CGGAUACCGG AGGAGACAUC GUCCAGCAAC CCUCUGACGA GACUGCCGCU UCAGAAGCGG 7500 
GGUCACUGUC CUCCAUGCCU CCCCUUGAGG GAGAGCCGGG AGACCCYGAC CUGGAGUUUG 7560 
AACCAGUGGG AUCCGCUCCC CCUUCUGAGG GGGAGUGUGA GGUCAUUGAU UCGGACUCUA 7620 
AGUCGUGGUC CACAGUCUCU GAUCAAGAGG AUUCUGUUAU CUGCUGCUCU AUGUCAUACU 7680 
CCUGGACGGG GGCCCUCAUA ACACCAUGUG GGCCCGAAGA GGAGAAGUUA CCGAUCAACC 7740 
CUCUGAGUAA UUCGCUCAUG CGGUUCCAUA AYAAGGUGUA CUCCACAACC UCGAGGAGUG 7800 
CCUCUCUGAG GGCAAAGAAG GUGACUUUUG ACAGGGUGCA GGUGCUGGAC GCACACUAUG 7860 
ACUCAGUCUU GCAGGACGUU AAGCGGGCCG CCUCUAAGGU URGUGCGAGG CUCCUCACAG 7920 
UAGAGGAAGC CUGCGCGCUG ACCCCGCCCC ACUCCGCCAA AUCGCGAUAC GGAUUUGGGG 7980 
CAAAAGAGGU GCGCAGCUUA UCCAGGAGGG CCGUUAACCA CAUCCGGUCC GUGUGGGAGG 8040 
ACCUCCUGGA AGACCAACRU ACCCCAAUUG ACACAACUAU CAUGGCUAAA AAUGAGGUGU 8100 
UCUGCAUUGA UCCAACUAAR GGUGGGAAAA AGCCAGCUCG CCUCAUCGUA UACCCCGACC 8160 
UUGGGGUCAG GGUGUGCGAA AAGAUGGCCC UCUAUGACAU CRCACAAAAG CUUCCCAAAG 8220 
CGAUAAUGGG GCCAUCCUAU GGGUUCCAAU ACUCUCCCGC AGAACGGGUC GAUUUCCUCC 8280 
UCAAAGCUUG GGGAAGUAAG AAGGACCCAA UGGGGUUCUC GUAUGACACC CGCUGCUUUG 8340 
ACUCAACCGU CACGGAGAGG GACAUAAGAA CAGAAGAAUC CAUAUAUCAG GCUUGUUCUC 8400 
UGCCUCAAGA AGCCAGAACU GUCAUACACU CGCUCACUGA GAGACUUUAC GUAGGAGGGC 8460 
CCAUGACAAA CAGCAAAGGG CAAUCCUGCG GCUACAGGCG UUGCCGCGCA AGCGGKGUUU 8520 
UCACCACCAG CAUGGGGAAU ACCAUGACAU GUUACAUCAA AGCCCUUGCA GCGUGUAAGG 8580 
CUGCRGGGAU CGUGGACCCU GUUAUGUUGG UGUGUGGAGA CGACCUGGUC GUCAUCUCAG 8640 
AGAGCCAAGG UAACGAGGAG GACGAGCGAA ACCUGAGAGC UUUCACGGAG GCUAUGACCA 8700 
GGUAUUCCGC CCCUCCCGGU GACCUUCCCA GACCGGAAUA UGACUUGGAG CUUAUAACAU 8760 
CCUGCUCCUC AAACGUAUCG GUAGCGCUGG ACUCUCGGGG UCGCCGCCGG UACUUCCUAA 8820 
CCAGAGACCC UACCACUCCA AUCACCCGAG CUGCUUGGGA AACAGUAAGA CACUCCCCUG 8880 
UCAAUUCUUG GCUGGGCAAC AUCAUCCAGU ACGCCCCCAC AAUCUGGGUC CGGAUGGUCA 8940 
UAAUGACUCA CUUCUUCUCC AUACUAUUGG CCCAGGACAC UCUGAACCAA AAUCUCAAUU 9000 
UUGAGAUGUA CGGGGCAGUA UACUCGGUCA AUCCAUUAGA CCUACCGGCC AUAAUUGAAA 9060 
GGCUACAUGG GCUUGAAGCC UUUUCACUGC ACACAUACUC UCCCCACGAA CUCUCACGGG 9120 
UGGCAGCAAC UCUCAGAAAA CUUGGAGCGC CUCCCCUUAG AGCGUGGAAG AGUCGGGCGC 9180 
GUGCCGUGAG AGCUUCACUC AUCGCCCAAG GAGCGAGGGC GGCCAUUUGU GGCCGCUACC 9240 
UCUUCAACUG GGCGGUGAAA ACAAAGCUCA AACUCACUCC AUUGCCCGAG GCGAGCCGCC 9300 
UGGAUUUAUC CGGGUGGUUC ACCGUGGGCG CCGGCGGGGG CGACAUUUAU CACAGCGUGU 9360 
CGCAUGCYCG ACCCCGCCUA UUACUCCUUU GCCUACUCCU ACUUAGCGUA GGAGUAGGCA 9420 
UCUUUUUACU CCCCGCUCGG UAGAGCGGCA AACYCUAGCU ACACUCCAUA GCUAGUUUCC 9480 
GUUUUUUUUU UUUUUUUUUU UUUUUUUUUU U 9511 

Sequence ID No. 7 
Sequence Length: 9,511 
Sequence Type: nucleic acid 
Strandedness: single 
Topology: linear 

Molecule Type: cDNA to genomic RNA 
Method for Determination of Feature: E 



GCCCGCCCCC TGATGGGGGC GACACTCCGC CATGAATCAC TCCCCTGTGA GGAACTACTG 60 

TCTTCACGCA GAAAGCGTCT AGCCATGGCG TTAGTATGAG TGTCGTACAG CCTCCAGGCC 120 

CCCCCCTCCC GGGAGAGCCA TAGTGGTCTG CGGAACCGGT GAGTACACCG GAATTACCGG 180 

AAAGACTGGG TCCTTTCTTG GATAAACCCA CTCTATGTCC GGTCATTTGG GCACGCCCCC 240 

GCAAGACTGC TAGCCGAGTA GCGTTGGGTT GCGAAAGGCC TTGTGGTACT GCCTGATAGG 300 

GTRCTTGCGA GTGCCCCGGG AGGTCTCGTA GACCGTGCAT CATGAGCACA AATCCTAAAC 360 

CTCAAAGAAA AACCAAAAGA AACACAAACC GCCGCCCACA GGACGTTAAG TTCCCGGGTG 420 

GCGGTCAGAT CGTTGGCGGA GTTTACTTGC TGCCGCGCAG GGGCCCCAGG TTGGGTGTGC 480 

GCGCGACAAG GAAGACTTCY GAGCGATCCC AGCCGCGTGG ACGACGCCAG CCCATCCCGA 540 

AAGATCGGCG CTCCACCGGC AAGTCCTGGG GAAAGCCAGG ATATCCTTGG CCCCTGTACG 600 

GAAACGAGGG TTGCGGCTGG GCGGGTTGGC TCCTGTCCCC CCGCGGGTCT CGTCCTACTT 660 

GGGGCCCCAC CGACCCCCGG CATAGATCAC GCAATTTGGG CAGAGTCATC GATACCATTA 720 

CGTGTGGTTT TGCCGACCTC ATGGGGTACA TCCCTGTCGT TGGCGCCCCG GTYGGAGGCG 780 

TCGCCAGAGC TCTGGCACAC GGTGTTAGGG TCCTGGAGGA CGGGATAAAT TACGCAACAG 840 

GGAATTTACC CGGTTGCTCT TTTTCTATCT TTTTGCTTGC TCTTCTGTCA TGCGTCACAR 900 

TGCCAGTGTC TGCAGTGGAA GTCAGGAACA TYAGTTCTAG CTACTACGCC ACTAATGATT 960 

GCTCAAACAA CAGCATCACC TGGCAGCTCA CTGACGCAGT TCTCCATCTT CCTGGATGCG 1020 

TCCCATGTGA GAAYGATAAY GGCACCTTGC RTTGCTGGAT ACAAGTAACA CCCRACGTGG 1080 

CTGTGAAACA CCGCGGTGCG CTCACTCGTA GCCTGCGAAC ACACGTCGAC ATGATCGTAA 1140 

TGGCAGCTAC GGCCTGCTCG GCCTTGTATG TGGGAGATGT GTGCGGGGCC GTGATGATYC 1200 
TATCGCAGGC TTTCATGGTA TCACCACAAC GCCACAACTT CACCCAAGAG TGCAACTGTT 1260 
CCATCTACCA AGGTCACATC ACCGGCCATC GCATGGCATG GGACATGATG CTRARCTGGT 1320 
CTCCAACTCT TRCCATGATC CTCGCCTACG CYGCTCGYGT TCCCGARCTG GTCCTCGAAA 1380 
TYATYTTCGG CGGCCATTGG GGTGTGGYGT TYGGCTTGGS CTATTTCTCC ATGCARGGAG 1440 
CGTGGGCCAA AGTCRTYGCC ATCCTCCTTC TTGTTGCGGG AGTGGATGCA WCCACCTATT 1500 
CCASCGGYCA GSAAGCGGGT CGTRCCGYCK HKGGGWTCKC TRGCCTCTTT AHTACTGGTG 1560 
CCAAGCAGAA CCTCYATTTR ATCAACACCA ATGGCAGCTG GCACATAAAC CGGACTGCCC 1620 
TCAATTGCAA TGACAGCYTA SAGACGGGTT TCHTCGCTTC CYTGKTTTAC WHCCRCARGT 1680 
TCAACAGCTC TGGCTGCCCC GAGCGCTTGT CTTCCTGCCG CGGGCTGGAC GAYTTYCGCA 1740 
TCGGCTGGGG AACCTTGGAA TACGAAACCA ACGTCACCAA CGATGRGGAC ATGAGGCCGT 1800 
ACTGCTGGCA TTACCCCCCG AGGCCTTGCG GCATCGTCCC GGCTAGGACG GTTTGCGGAC 1860 
CGGTCTATTG YTTCACCCCT AGCCCTGTTG TCGTGGGCAC CACTGACAAG CAGGGCGTAC 1920 
CCACCTACAC CTGGGGRGAA AACGAGACCG ATGTCTTCCT GCTRAATAGC ACAAGACCCC 1980 
CGCGAGGAGC TTGGTTCGGC TGCACYTGGA TGAACGGGAC TGGGTTCACT AAGACATGCG 2040 
GTGCACCACC TTGCCGCATT AGGAAAGACT ACAACAGCAC TCTCGATTTA TTGTGCCCCA 2100 
CAGACTGTTT TAGGAAGCAC CCAGATGCTA CCTATCTTAA GTGTGGAGCA GGGCCTTGGT 2160 
TAACTCCCAG GTGCCTGGTA GACTACCCTT ATAGRYTGTG GCATTATCCG TGCACTGTAA 2220 
ACTTCACCAT CTTYAAGGCG CGGATGTATG TAGGAGGGGT GGAGCATCGA TTCTCCGCAG 2280 
CATGCAACTT CACGCGCGGA GATCGCTGCA GACTGGAAGA TAGGGATAGG GGYCAGCAGA 2340 
GTCCACTGCT GCATTCCACT ACTGAGTGGG CGGTGYTCCC ATGCTCCTTC TCTGACCTAC 2400 
CAGCACTATC CACTGGCCTA TTGCACCTCC ACCAAAACAT CGTGGACGTG CAGTACCTYT 2460 
ACGGACTTTC TCCGGCTCTG ACAAGATACA TCGTGAAGTG GGAGTGGGTG ATCCTCCTTT 2520 
TCTTGTTGTT GGCAGACGCC AGGRTCTGTG CATGCCTTTG GATGCTCAWC ATACTGGGCC 2580 
AAGCCGAAGC GGCGCTTGAG AAGCTCATCA TCTTGCACTC CGCTAGYGCT GCTAGTGCCA 2640 
ATGGTCCGCT GTGGTTTTTC ATCTTCTTTA CAGCGGCCTG GTACTTAAAG GGCAGGGTGG 2700 
TCCCCGTGGC CACGTACTCT GTBCTCGGCT TRTGGTCCTT CCTCCTCCTA GTCCTGGCYT 2760 
TACCACAGCA GGCTTATGCC TTGGACGCTG CTGAACAAGG GGAACTGGGG CTGGCCATAT 2820 
TAGTAATTAT ATCCATCTTT ACTCTTACCC CAGCATACAA GATCCTCCTG AGCCGTTCAG 2880 
TGTGGTGGCT GTCCTACATG CTGGTGTTGG CCGAGGCCCA GATTCAGCAA TGGGTTCCCC 2940 
CCCTGGAGGT CCGAGGGGGG CGTGACGGGA TCATCTGGGT GGCTGTCATT CTACACCCAC 3000 
GCCTTGTGTT TGAGGTCACG AAATGGTTGT TAGCAATCCT GGGGCCTGCC TACCTCCTTA 3060 
RAGCGTCTCT GCTACGGATA CCGTACTTTG TGAGGGCCCA CGCTTTGCTA CGAGTGTGTA 3120 
CCCTGGTGAA ACACCTCGCR GGGGCTAGGT ACATCCAGAT GCTGTTRATC ACCATAGGCA 3180 
GATGGACCGG CACTTACATC TACGACCACC TCTCCCCTTT ATCAACTTGG GCGGCCCAGG 3240 
GTTTRCGGGA CCTGGCAATC GCCGTGGAGC CTGTGGTGTT CAGCCCAATG GAGAAGAAGG 3300 
TCATTGTGTG GGGGGCTGAG ACAGTGGCGT GTGGAGACAT CCTGCATGGC CTCCCGGTCT 3360 
CCGCGAGGCT AGGTAGGGAR GTTCTGCTCG GCCCTGCCGA CGGCTACACC TCCAAGGGGT 3420 
GGAAKCTCCT AGCTCCCATT ACTGCTTACA CTCAGCAAAC TCGTGGTCTC CTGGGTGCTA 3480 
TCGTGGTCAG CCTAACGGGC CGCGACAAAA ATGAGCAGGC TGGGCAGGTC CAGGTTCTGT 3540 
CCTCCGTCAC ACAAACTTTC TTGGGGACAT CCATTTCGGG CGTCCTCTGG ACAGTATATC 3600 
ACGGGGCTGG TAATAAGACC TTGGCCGGCC CCAAGGGACC AGTCACTCAG ATGTACACCA 3660 
GCGCAGAAGG GGACCTCGTG GGATGGCCTA GTCCCCCCGG GACTAAGTCA TTGGACCCCT 3720 
GTACCTGCGG GGCCGTAGAC CTCTACCTGG TCACCCGAAA CGCTGATGTC ATTCCGGTCC 3780 
GGAGGAAAGA TGACCGACGG GGTGCATTAC TCTCGCCAAG GCCCCTCTCA ACCCTCAAAG 3840 
GATCATCCGG AGGGCCCGTG CTCTGCTCWA GGGGACACGC CGTGGGCTTG TTCAGAGCGG 3900 
CCGTGTGTGC CAGGGGTGTA GCCAAATCTA TTGACTTCAT CCCCGTCGAA TCACTCGATR 3960 
TCGCCACACG GACGCCCAGT TTCTCTGACA ACAGTRCGCC GCCAGCTGTG CCCCAGTCTT 4020 
ACCAGGTGGG TTACTTGCAC GCACCAACAG GCAGCGGAAA GAGCACCAAG GTCCCTGCCG 4080 
CGTATGCCAG TCAGGGGTAT AAAGTACTCG TACTAAATCC CTCTGTCGCG GCCACACTTG 4140 
GTTTTGGGGC CTACATGTCC AAAGCCCACG GGATCAACCC TAATATCAGA ACTGGAGTGC 4200 
GGACCGTTAC CACCGGGGAC TCTATCACTT ACTCCACTTA TGGCAAGTTT ATCGCAGATG 4260 
GAGGCTGTGC AGCCGGTGCC TATGACATCA TCATATGCGA CGAATGCCAT TCAGTGGACG 4320 
CTACTACCAT CCTTGGCATT GGAACAGTCC TTGACCAAGC TGAGACCGCA GGCGTCAGGC 4380 
TAGTGGTYTT GGCCACAGCC ACGCCTCCCG GTACGGTGAC AACTCCCCAC AGTAACATAG 4440 
AGGAGGTGGC CCTTGGTCAC GAGGGCGAGA TCCCTTTTTA TGGCAAAGCT ATTCCCCTAG 4500 
CTTTCATCAA GGGGGGCAGA CACTTGATCT TTTGCCATTC AAAGAAGAAG TGCGACGAGC 4560 
TCGCAGCGGC CCTCCGGGGC AYGGGTGTCA ATGCCGTTGC ATACTATAGG GGTCTCGACG 4620 
TCTCCGTTAT ACCAACTCAA GGAGACGTGG TGGTTGTCGC CACTGATGCC CTAATGACTG 4680 
GGTACACCGG CGACTTTGAC TCYGTCATCG ACTGTAATGT TGCAGTCTCT CAGATTGTTG 4740 
ACTTCAGCCT AGACCCAACC TTCACCATCA CCACTCAAAC CGTCCCTCAG GACGCTGTCT 4800 
CCCGTAGTCA ACGTAGAGGG AGAACTGGGA GGGGGCGATT GGGCRTTTAC AGGTATGTTT 4860 
CGTCAGGYGA RRGGCCGTCT GGGATGTTCG ACAGCGTAGT GCYCTGCGAG TGCTATGATG 4920 
CCGGGGCAGC CTGGTACGAG CTTACACCTG CTGAGACTAC GGTGAGACTC CGGGCYTATT 4980 
TCAACACGCC CGGTTTGCCC GTATGTCAAG ACCACCTGGA GTTCTGGGAA GCGGTCTTTA 5040 
CAGGTCTCAC WCACATTRAC GCCCACTTCC TCTCCCAGAC GAAGCAAGGA GGAGAAAACT 5100 
TTGCRTATCT AACGGCCTAC CAGGCCACAG TATGCGCCAG GGCAAAGGCC CCTCCTCCTT 5160 
CGTGGGACGT GATGTGGAAG TGTCTAACTA GGCTCAAACC TACACTGACT GGTCCCACCC 5220 
CCCTCCTGTA CCGCTTGGGT GCCGTGACCA ATGAGGTYAC CTTGACGCAC CCCGTGACGA 5280 
AATACATCGC CACGTGCATG CAAGCTGACC TYGAGATCAT GACAAGCTCA TGGGTCCTGG 5340 
CGGGGGGGGT GCTAGCCGCC GTGGCAGCTT ACTGCCTGGC GACTGGCTGC ATTTCCATCA 5400 
TTGGCCGCCT ACACCTGAAT GATCGGGTGG TTGTGRCCCC YGACAAGGAR ATCTTATATG 5460 
AGGCCTTTGA TGAGATGGAA GAATGCGCCT CCAAAGCCGC CCTCATTGAG GAAGGGCAGC 5520 
GGATGGCGGA GATGCTCAAA TCTAAGATAC AAGGCCTCCT ACAACAGGCC ACAAGGCAAG 5580 
CTCAAGRCAT RCAGCCAGCT ATACAGTCAT CATGGCCCAA GCTTGAACAA TTTTGGGCCA 5640 
AACACATGTG GAACTTCATC AGTGGTATAC AGTACCTAGC AGGACTCTCC ACCCTACCGG 5700 
GAAATCCTGC AGTRGCATCA ATGATGGCTT TTAGCGCCGC GCTGACTAGC CCACTACCCA 5760 
CCAGCACCAC CATCCTCTTG AACATCATGG GAGGATGCTT GGCCTCYCAG ATTGCCCCCC 5820 
CTGCCGGAGC CACYGGCTTC GTTGTCAGTG GTCTAGTGGG GGCGGCCGTC GGAAGCATAG 5880 
GCCTGGGTAA GATACTGGTG GACGTTTTGG CCGGGTACGG CGCAGGCATT TCAGGGGCCC 5940 
TCGTAGCTTT TAAGATCATG AGCGGCGAGA AGCCCACGGT AGAAGACGTT GTGAATCTCC 6000 
TGCCTGCTAT YCTGTCTCCT GGTGCGYTGG TAGTGGGAGT CATCTGTGCA GCAATYCTGC 6060 
GCCGCCACGT CGGTCAGGGA GAGGGRGCGG TCCAGTGGAT GAACAGACTG ATCGCCTTCG 6120 
CCTCCAGGGG AAACCACGTT GCCCCTACCC ACTACGTGGT GGAGTCTGAC GCTTCACAGC 6180 
GTGTRACGCA GGTGCTGAGT TCACTTACAA TTACCAGCTT ACTTAGGAGA CTACATGCCT 6240 
GGATCACTGA AGATTGCCCA RTCCCATGCT CGGGGTCTTG GCTCCAGGAC ATTTGGGATT 6300 
GGGTTTGTTC CATCCTCACA GACTTYAAAA ACTGGCTGTC TTCAAAATTA CTCCCCAAGA 6360 
TGCCCGGCAT TCCCTTTATC TCTTGCCAGA AGGGATACAA GGGTGTATGG GCTGGTACGG 6420 
GTGTCATGAC YACTCGRTRC CCATGTGGAG CAAACATCTC GGGCCATGTC CGCATGGGCA 6480 
CCATGAAAAT AACAGGCCCG AAGACTTGCT TGAACCTGTG GCAGGGGACT TTCCCCATTA 6540 
ATTGTTACAC AGAAGGGCCY TGCGTGCCAA AACCCCCTCC TAATTACAAG ACCGCAATTT 6600 
GGAGGGTGGC AGCGTCGGAG TACGTTGAGG TCACACAGCA TGGCTCTTTC TCGTATGTAA 6660 
CRGGGTTAAC CAGTGACAAC CTTAAGGTYC CTTGCCAGGT ACCAGCTCCA GAATTTTTCT 6720 
CTTGGGTGGA CGGGGTGCAA ATCCACCGAT TCGCCCCCGT WCCAGGTCCC TTCTTTCGGG 6780 
ATGAGGTAAC GTTCACCGTA GGCCTTAACT CCTTCGTGGT CGGCTCTCAG CTCCCTTGCG 6840 
ATCCTGAGCC GGACACCGAR GTACTGGCCT CYATGTTGAC AGACCCGTCC CACATCACCG 6900 
CKGAGGCGGC AGCCAGGCGA TTGGCAAGGG GATCTCCCCC YTCACAGGCT AGCTCCTCAG 6960 
CGAGCCAGCT CTCTGCCCCG TCCTTGAAGG CTACCTGTAC CACCCATAAG ACAGCATATG 7020 
ATTGTGACAT GGTGGATGCY AACCTTTTCA TGGGAGGHGA TGTGAYCCGG ATTGAGTCTG 7080 
ACTCTAAGGT GATCGTTCTA GACTCCCTCG ATTCCATGAC TGAGGTAGAG GATGATCGTG 7140 
AGCCTTCTGT ACCATCAGAG TACCTGATCA AGAGGAGAAA GTTCCCACCG GCGCTGCCTC 7200 
CTTGGGCCCG TCCAGACTAC AATCCTGTTT TGATCGAGAC ATGGAAGAGG CCGGGCTATG 7260 
AACCACCCAC TGTCCTAGGC TGTGCCCTCC CCCCCACACY TCAAACGCCA GTGCCTCCAC 7320 
CTCGGAGGCG CCGCGCYAAA RTCCTGACCC AGGACRATGT GGAGGGGRTC CTCAGGGAGA 7380 
TGGCTGACAA AGTRCTCAGC CCTCTCCAAG ACAACAATGA CTCCGGTCAC TCCACTGGAG 7440 
CGGATACCGG AGGAGACATC GTCCAGCAAC CCTCTGACGA GACTGCCGCT TCAGAAGCGG 7500 
GGTCACTGTC CTCCATGCCT CCCCTTGAGG GAGAGCCGGG AGACCCYGAC CTGGAGTTTG 7560 
AACCAGTGGG ATCCGCTCCC CCTTCTGAGG GGGAGTGTGA GGTCATTGAT TCGGACTCTA 7620 
AGTCGTGGTC CACAGTCTCT GATCAAGAGG ATTCTGTTAT CTGCTGCTCT ATGTCATACT 7680 
CCTGGACGGG GGCCCTCATA ACACCATGTG GGCCCGAAGA GGAGAAGTTA CCGATCAACC 7740 
CTCTGAGTAA TTCGCTCATG CGGTTCCATA AYAAGGTGTA CTCCACAACC TCGAGGAGTG 7800 
CCTCTCTGAG GGCAAAGAAG GTGACTTTTG ACAGGGTGCA GGTGCTGGAC GCACACTATG 7860 
ACTCAGTCTT GCAGGACGTT AAGCGGGCCG CCTCTAAGGT TRGTGCGAGG CTCCTCACAG 7920 
TAGAGGAAGC CTGCGCGCTG ACCCCGCCCC ACTCCGCCAA ATCGCGATAC GGATTTGGGG 7980 
CAAAAGAGGT GCGCAGCTTA TCCAGGAGGG CCGTTAACCA CATCCGGTCC GTGTGGGAGG 8040 
ACCTCCTGGA AGACCAACRT ACCCCAATTG ACACAACTAT CATGGCTAAA AATGAGGTGT 8100 
TCTGCATTGA TCCAACTAAR GGTGGGAAAA AGCCAGCTCG CCTCATCGTA TACCCCGACC 8160 



TTGGGGTCAG GGTGTGCGAA AAGATGGCCC TCTATGACAT CRCACAAAAG CTTCCCAAAG 8220 
CGATAATGGG GCCATCCTAT GGGTTCCAAT ACTCTCCCGC AGAACGGGTC GATTTCCTCC 8280 
TCAAAGCTTG GGGAAGTAAG AAGGACCCAA TGGGGTTCTC GTATGACACC CGCTGCTTTG 8340 
ACTCAACCGT CACGGAGAGG GACATAAGAA CAGAAGAATC CATATATCAG GCTTGTTCTC 8400 
TGCCTCAAGA AGCCAGAACT GTCATACACT CGCTCACTGA GAGACTTTAC GTAGGAGGGC 8460 
CCATGACAAA CAGCAAAGGG CAATCCTGCG GCTACAGGCG TTGCCGCGCA AGCGGKGTTT 8520 
TCACCACCAG CATGGGGAAT ACCATGACAT GTTACATCAA AGCCCTTGCA GCGTGTAAGG 8580 
CTGCRGGGAT CGTGGACCCT GTTATGTTGG TGTGTGGAGA CGACCTGGTC GTCATCTCAG 8640 
AGAGCCAAGG TAACGAGGAG GACGAGCGAA ACCTGAGAGC TTTCACGGAG GCTATGACCA 8700 
GGTATTCCGC CCCTCCCGGT GACCTTCCCA GACCGGAATA TGACTTGGAG CTTATAACAT 8760 
CCTGCTCCTC AAACGTATCG GTAGCGCTGG ACTCTCGGGG TCGCCGCCGG TACTTCCTAA 8820 
CCAGAGACCC TACCACTCCA ATCACCCGAG CTGCTTGGGA AACAGTAAGA CACTCCCCTG 8880 
TCAATTCTTG GCTGGGCAAC ATCATCCAGT ACGCCCCCAC AATCTGGGTC CGGATGGTCA 8940 
TAATGACTCA CTTCTTCTCC ATACTATTGG CCCAGGACAC TCTGAACCAA AATCTCAATT 9000 
TTGAGATGTA CGGGGCAGTA TACTCGGTCA ATCCATTAGA CCTACCGGCC ATAATTGAAA 9060 
GGCTACATGG GCTTGAAGCC TTTTCACTGC ACACATACTC TCCCCACGAA CTCTCACGGG 9120 
TGGCAGCAAC TCTCAGAAAA CTTGGAGCGC CTCCCCTTAG AGCGTGGAAG AGTCGGGCGC 9180 
GTGCCGTGAG AGCTTCACTC ATCGCCCAAG GAGCGAGGGC GGCCATTTGT GGCCGCTACC 9240 
TCTTCAACTG GGCGGTGAAA ACAAAGCTCA AACTCACTCC ATTGCCCGAG GCGAGCCGCC 9300 
TGGATTTATC CGGGTGGTTC ACCGTGGGCG CCGGCGGGGG CGACATTTAT CACAGCGTGT 9360 
CGCATGCYCG ACCCCGCCTA TTACTCCTTT GCCTACTCCT ACTTAGCGTA GGAGTAGGCA 9420 
TCTTTTTACT CCCCGCTCGG TAGAGCGGCA AACYCTAGCT ACACTCCATA GCTAGTTTCC 9480 
GTTTTTTTTT TTTTTTTTTT TTTTTTTTTT T 9511 

Sequence ID No. 8 
Sequence Length: 3,033 
Sequence Type: amino acid 
Topology: linear 
Molecule Type: protein 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr 
5 10 15 

Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile 
20 25 30 

Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly 
35 40 45 

Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly 
50 55 60 

Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg Ser Thr Gly Lys Ser 
65 70 75 

Trp Gly Lys Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly 
80 85 90 

Cys Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro 
95 100 105 

Thr Trp Gly Pro Thr Asp Pro Arg His Arg Ser Arg Asn Leu Gly 
110 115 120 

Arg Val Ile Asp Thr Ile Thr Cys Gly Phe Ala Asp Leu Met Gly 
125 130 135 

Tyr Ile Pro Val Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala 
140 145 150 

Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Ile Asn Tyr Ala 
155 160 165 

Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala 
170 175 180 

Leu Leu Ser Cys Val Thr Val Pro Val Ser Ala Val Glu Val Arg 
185 190 195 

Asn Ile Ser Ser Ser Tyr Tyr Ala Thr Asn Asp Cys Ser Asn Asn 
200 205 210 

Ser Ile Thr Trp Gln Leu Thr Asp Ala Val Leu His Leu Pro Gly 
215 220 225 

Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu His Cys Trp Ile 
230 235 240 

Gln Val Thr Pro Asn Val Ala Val Lys His Arg Gly Ala Leu Thr 
245 250 255 

Arg Ser Leu Arg Thr His Val Asp Met Ile Val Met Ala Ala Thr 
260 265 270 

Ala Cys Ser Ala Leu Tyr Val Gly Asp Val Cys Gly Ala Val Met 
275 280 285 

Ile Leu Ser Gln Ala Phe Met Val Ser Pro Gln Arg His Asn Phe 
290 295 300 

Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly His Ile Thr Gly 
305 310 315 

His Arg Met Ala Trp Asp Met Met Leu Ser Trp Ser Pro Thr Leu 
320 325 330 

Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro Glu Leu Val Leu 
335 340 345 

Glu Ile Ile Phe Gly Gly His Trp Gly Val Val Phe Gly Leu Ala 
350 355 360 

Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val Ile Ala Ile Leu 
365 370 375 

Leu Leu Val Ala Gly Val Asp Ala Thr Thr Tyr Ser Ser Gly Gln 
380 385 390 

Glu Ala Gly Arg Thr Val Ala Gly Phe Ala Gly Leu Phe Thr Thr 
395 400 405 

Gly Ala Lys Gln Asn Leu Tyr Leu Ile Asn Thr Asn Gly Ser Trp 
410 415 420 

His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Gln Thr 
425 430 435 

Gly Phe Leu Ala Ser Leu Phe Tyr Thr His Lys Phe Asn Ser Ser 
440 445 450 

Gly Cys Pro Glu Arg Leu Ser Ser Cys Arg Gly Leu Asp Asp Phe 
455 460 465 

Arg Ile Gly Trp Gly Thr Leu Glu Tyr Glu Thr Asn Val Thr Asn 
470 475 480 

Asp Gly Asp Met Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Pro 
485 490 495 

Cys Gly Ile Val Pro Ala Arg Thr Val Cys Gly Pro Val Tyr Cys 
500 505 510 

Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Lys Gln Gly 
515 520 525 

Val Pro Thr Tyr Thr Trp Gly Glu Asn Glu Thr Asp Val Phe Leu 
530 535 540 

Leu Asn Ser Thr Arg Pro Pro Arg Gly Ala Trp Phe Gly Cys Thr 
545 550 555 

Trp Met Asn Gly Thr Gly Phe Thr Lys Thr Cys Gly Ala Pro Pro 
560 565 570 

Cys Arg Ile Arg Lys Asp Tyr Asn Ser Thr Ile Asp Leu Leu Cys 
575 580 585 

Pro Thr Asp Cys Phe Arg Lys His Pro Asp Ala Thr Tyr Leu Lys 
590 595 600 

Cys Gly Ala Gly Pro Trp Leu Thr Pro Arg Cys Leu Val Asp Tyr 
605 610 615 

Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn Phe Thr Ile 
620 625 630 

Phe Lys Ala Arg Met Tyr Val Gly Gly Val Glu His Arg Phe Ser 
635 640 645 

Ala Ala Cys Asn Phe Thr Arg Gly Asp Arg Cys Arg Leu Glu Asp 
650 655 660 

Arg Asp Arg Gly Gln Gln Ser Pro Leu Leu His Ser Thr Thr Glu 
665 670 675 

Trp Ala Val Leu Pro Cys Ser Phe Ser Asp Leu Pro Ala Leu Ser 
680 685 690 

Thr Gly Leu Leu His Leu His Gln Asn Ile Val Asp Val Gln Tyr 
695 700 705 

Leu Tyr Gly Leu Ser Pro Ala Leu Thr Arg Tyr Ile Val Lys Trp 
710 715 720 

Glu Trp Val Ile Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Ile 
725 730 735 

Cys Ala Cys Leu Trp Met Leu Ile Ile Leu Gly Gln Ala Glu Ala 
740 745 750 

Ala Leu Glu Lys Leu Ile Ile Leu His Ser Ala Ser Ala Ala Ser 
755 760 765 

Ala Asn Gly Pro Leu Trp Phe Phe Ile Phe Phe Thr Ala Ala Trp 
770 775 780 

Tyr Leu Lys Gly Arg Val Val Pro Val Ala Thr Tyr Ser Val Leu 
785 790 795 

Gly Leu Trp Ser Phe Leu Leu Leu Val Leu Ala Leu Pro Gln Gln 
800 805 810 

Ala Tyr Ala Leu Asp Ala Ala Glu Gln Gly Glu Leu Gly Leu Ala 
815 820 825 

Ile Leu Val Ile Ile Ser Ile Phe Thr Leu Thr Pro Ala Tyr Lys 
830 835 840 

Ile Leu Leu Ser Arg Ser Val Trp Trp Leu Ser Tyr Met Leu Val 
845 850 855 

Leu Ala Glu Ala Gln Ile Gln Gln Trp Val Pro Pro Leu Glu Val 
860 865 870 

Arg Gly Gly Arg Asp Gly Ile Ile Trp Val Ala Val Ile Leu His 
875 880 885 

Pro Arg Leu Val Phe Glu Val Thr Lys Trp Leu Leu Ala Ile Leu 
890 895 900 

Gly Pro Ala Tyr Leu Leu Lys Ala Ser Leu Leu Arg Ile Pro Tyr 
905 910 915 

Phe Val Arg Ala His Ala Leu Leu Arg Val Cys Thr Leu Val Lys 
920 925 930 

His Leu Ala Gly Ala Arg Tyr Ile Gln Met Leu Leu Ile Thr Ile 
935 940 945 

Gly Arg Trp Thr Gly Thr Tyr Ile Tyr Asp His Leu Ser Pro Leu 
950 955 960 

Ser Thr Trp Ala Ala Gln Gly Leu Arg Asp Leu Ala Ile Ala Val 
965 970 975 

Glu Pro Val Val Phe Ser Pro Met Glu Lys Lys Val Ile Val Trp 
980 985 990 

Gly Ala Glu Thr Val Ala Cys Gly Asp Ile Leu His Gly Leu Pro 
995 1000 1005 

Val Ser Ala Arg Leu Gly Arg Glu Val Leu Leu Gly Pro Ala Asp 
1010 1015 1020 

Gly Tyr Thr Ser Lys Gly Trp Lys Leu Leu Ala Pro Ile Thr Ala 
1025 1030 1035 

Tyr Thr Gln Gln Thr Arg Gly Leu Leu Gly Ala Ile Val Val Ser 
1040 1045 1050 

Leu Thr Gly Arg Asp Lys Asn Glu Gln Ala Gly Gln Val Gln Val 
1055 1060 1065 

Leu Ser Ser Val Thr Gln Thr Phe Leu Gly Thr Ser Ile Ser Gly 
1070 1075 1080 

Val Leu Trp Thr Val Tyr His Gly Ala Gly Asn Lys Thr Leu Ala 
1085 1090 1095 

Gly Pro Lys Gly Pro Val Thr Gln Met Tyr Thr Ser Ala Glu Gly 
1100 1105 1110 

Asp Leu Val Gly Trp Pro Ser Pro Pro Gly Thr Lys Ser Leu Asp 
1115 1120 1125 

Pro Cys Thr Cys Gly Ala Val Asp Leu Tyr Leu Val Thr Arg Asn 
1130 1135 1140 

Ala Asp Val Ile Pro Val Arg Arg Lys Asp Asp Arg Arg Gly Ala 
1145 1150 1155 

Leu Leu Ser Pro Arg Pro Leu Ser Thr Leu Lys Gly Ser Ser Gly 
1160 1165 1170 

Gly Pro Val Leu Cys Ser Arg Gly His Ala Val Gly Leu Phe Arg 
1175 1180 1185 

Ala Ala Val Cys Ala Arg Gly Val Ala Lys Ser Ile Asp Phe Ile 
1190 1195 1200 

Pro Val Glu Ser Leu Asp Val Ala Thr Arg Thr Pro Ser Phe Ser 
1205 1210 1215 

Asp Asn Ser Thr Pro Pro Ala Val Pro Gln Ser Tyr Gln Val Gly 
1220 1225 1230 

Tyr Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro 
1235 1240 1245 

Ala Ala Tyr Ala Ser Gln Gly Tyr Lys Val Leu Val Leu Asn Pro 
1250 1255 1260 

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala 
1265 1270 1275 

His Gly Ile Asn Pro Asn Ile Arg Thr Gly Val Arg Thr Val Thr 
1280 1285 1290 

Thr Gly Asp Ser Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Ile Ala 
1295 1300 1305 

Asp Gly Gly Cys Ala Ala Gly Ala Tyr Asp Ile Ile Ile Cys Asp 
1310 1315 1320 

Glu Cys His Ser Val Asp Ala Thr Thr Ile Leu Gly Ile Gly Thr 
1325 1330 1335 

Val Leu Asp Gln Ala Glu Thr Ala Gly Val Arg Leu Val Val Leu 
1340 1345 1350 

Ala Thr Ala Thr Pro Pro Gly Thr Val Thr Thr Pro His Ser Asn 
1355 1360 1365 

Ile Glu Glu Val Ala Leu Gly His Glu Gly Glu Ile Pro Phe Tyr 
1370 1375 1380 

Gly Lys Ala Ile Pro Leu Ala Phe Ile Lys Gly Gly Arg His Leu 
1385 1390 1395 

Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Ala 
1400 1405 1410 

Leu Arg Gly Met Gly Val Asn Ala Val Ala Tyr Tyr Arg Gly Leu 
1415 1420 1425 

Asp Val Ser Val Ile Pro Thr Gln Gly Asp Val Val Val Val Ala 
1430 1435 1440 

Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val 
1445 1450 1455 

Ile Asp Cys Asn Val Ala Val Ser Gln Ile Val Asp Phe Ser Leu 
1460 1465 1470 

Asp Pro Thr Phe Thr Ile Thr Thr Gln Thr Val Pro Gln Asp Ala 
1475 1480 1485 

Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Leu 
1490 1495 1500 

Gly Val Tyr Arg Tyr Val Ser Ser Gly Glu Arg Pro Ser Gly Met 
1505 1510 1515 

Phe Asp Ser Val Val Leu Cys Glu Cys Tyr Asp Ala Gly Ala Ala 
1520 1525 1530 

Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala 
1535 1540 1545 

Tyr Phe Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu 
1550 1555 1560 

Phe Trp Glu Ala Val Phe Thr Gly Leu Thr His Ile Asp Ala His 
1565 1570 1575 

Phe Leu Ser Gln Thr Lys Gln Gly Gly Glu Asn Phe Ala Tyr Leu 
1580 1585 1590 

Thr Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys Ala Pro Pro 
1595 1600 1605 

Pro Ser Trp Asp Val Met Trp Lys Cys Leu Thr Arg Leu Lys Pro 
1610 1615 1620 

Thr Leu Thr Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val 
1625 1630 1635 

Thr Asn Glu Val Thr Leu Thr His Pro Val Thr Lys Tyr Ile Ala 
1640 1645 1650 

Thr Cys Met Gln Ala Asp Leu Glu Ile Met Thr Ser Ser Trp Val 
1655 1660 1665 

Leu Ala Gly Gly Val Leu Ala Ala Val Ala Ala Tyr Cys Leu Ala 
1670 1675 1680 

Thr Gly Cys Ile Ser Ile Ile Gly Arg Leu His Leu Asn Asp Arg 
1685 1690 1695 

Val Val Val Ala Pro Asp Lys Glu Ile Leu Tyr Glu Ala Phe Asp 
1700 1705 1710 

Glu Met Glu Glu Cys Ala Ser Lys Ala Ala Leu Ile Glu Glu Gly 
1715 1720 1725 

Gln Arg Met Ala Glu Met Leu Lys Ser Lys Ile Gln Gly Leu Leu 
1730 1735 1740 

Gln Gln Ala Thr Arg Gln Ala Gln Asp Ile Gln Pro Ala Ile Gln 
1745 1750 1755 

Ser Ser Trp Pro Lys Leu Glu Gln Phe Trp Ala Lys His Met Trp 
1760 1765 1770 

Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu 
1775 1780 1785 

Pro Gly Asn Pro Ala Val Ala Ser Met Met Ala Phe Ser Ala Ala 
1790 1795 1800 

Leu Thr Ser Pro Leu Pro Thr Ser Thr Thr Ile Leu Leu Asn Ile 
1805 1810 1815 

Met Gly Gly Trp Leu Ala Ser Gln Ile Ala Pro Pro Ala Gly Ala 
1820 1825 1830 

Thr Gly Phe Val Val Ser Gly Leu Val Gly Ala Ala Val Gly Ser 
1835 1840 1845 

Ile Gly Leu Gly Lys Ile Leu Val Asp Val Leu Ala Gly Tyr Gly 
1850 1855 1860 

Ala Gly Ile Ser Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly 
1865 1870 1875 

Glu Lys Pro Thr Val Glu Asp Val Val Asn Leu Leu Pro Ala Ile 
1880 1885 1890 

Leu Ser Pro Gly Ala Leu Val Val Gly Val Ile Cys Ala Ala Ile 
1895 1900 1905 

Leu Arg Arg His Val Gly Gln Gly Glu Gly Ala Val Gln Trp Met 
1910 1915 1920 

Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ala Pro 
1925 1930 1935 

Thr His Tyr Val Val Glu Ser Asp Ala Ser Gln Arg Val Thr Gln 
1940 1945 1950 

Val Leu Ser Ser Leu Thr Ile Thr Ser Leu Leu Arg Arg Leu His 
1955 1960 1965 

Ala Trp Ile Thr Glu Asp Cys Pro Val Pro Cys Ser Gly Ser Trp 
1970 1975 1980 

Leu Gln Asp Ile Trp Asp Trp Val Cys Ser Ile Leu Thr Asp Phe 
1985 1990 1995 

Lys Asn Trp Leu Ser Ser Lys Leu Leu Pro Lys Met Pro Gly Ile 
2000 2005 2010 

Pro Phe Ile Ser Cys Gln Lys Gly Tyr Lys Gly Val Trp Ala Gly 
2015 2020 2025 

Thr Gly Val Met Thr Thr Arg Cys Pro Cys Gly Ala Asn Ile Ser 
2030 2035 2040 

Gly His Val Arg Met Gly Thr Met Lys Ile Thr Gly Pro Lys Thr 
2045 2050 2055 

Cys Leu Asn Leu Trp Gln Gly Thr Phe Pro Ile Asn Cys Tyr Thr 
2060 2065 2070 

Glu Gly Pro Cys Val Pro Lys Pro Pro Pro Asn Tyr Lys Thr Ala 
2075 2080 2085 

Ile Trp Arg Val Ala Ala Ser Glu Tyr Val Glu Val Thr Gln His 
2090 2095 2100 

Gly Ser Phe Ser Tyr Val Thr Gly Leu Thr Ser Asp Asn Leu Lys 
2105 2110 2115 

Val Pro Cys Gln Val Pro Ala Pro Glu Phe Phe Ser Trp Val Asp 
2120 2125 2130 

Gly Val Gln Ile His Arg Phe Ala Pro Val Pro Gly Pro Phe Phe 
2135 2140 2145 

Arg Asp Glu Val Thr Phe Thr Val Gly Leu Asn Ser Phe Val Val 
2150 2155 2160 

Gly Ser Gln Leu Pro Cys Asp Pro Glu Pro Asp Thr Glu Val Leu 
2165 2170 2175 

Ala Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala 
2180 2185 2190 

Ala Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Gln Ala Ser Ser 
2195 2200 2205 

Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr 
2210 2215 2220 

Thr His Lys Thr Ala Tyr Asp Cys Asp Met Val Asp Ala Asn Leu 
2225 2230 2235 

Phe Met Gly Gly Asp Val Thr Arg Ile Glu Ser Asp Ser Lys Val 
2240 2245 2250 

Ile Val Leu Asp Ser Leu Asp Ser Met Thr Glu Val Glu Asp Asp 
2255 2260 2265 

Arg Glu Pro Ser Val Pro Ser Glu Tyr Leu Ile Lys Arg Arg Lys 
2270 2275 2280 

Phe Pro Pro Ala Leu Pro Pro Trp Ala Arg Pro Asp Tyr Asn Pro 
2285 2290 2295 

Val Leu Ile Glu Thr Trp Lys Arg Pro Gly Tyr Glu Pro Pro Thr 
2300 2305 2310 

Val Leu Gly Cys Ala Leu Pro Pro Thr Pro Gln Thr Pro Val Pro 
2315 2320 2325 

Pro Pro Arg Arg Arg Arg Ala Lys Val Leu Thr Gln Asp Asn Val 
2330 2335 2340 

Glu Gly Val Leu Arg Glu Met Ala Asp Lys Val Leu Ser Pro Leu 
2345 2350 2355 

Gln Asp Asn Asn Asp Ser Gly His Ser Thr Gly Ala Asp Thr Gly 
2360 2365 2370 

Gly Asp Ile Val Gln Gln Pro Ser Asp Glu Thr Ala Ala Ser Glu 
2375 2380 2385 

Ala Gly Ser Leu Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly 
2390 2395 2400 

Asp Pro Asp Leu Glu Phe Glu Pro Val Gly Ser Ala Pro Pro Ser 
2405 2410 2415 

Glu Gly Glu Cys Glu Val Ile Asp Ser Asp Ser Lys Ser Trp Ser 
2420 2425 2430 

Thr Val Ser Asp Gln Glu Asp Ser Val Ile Cys Cys Ser Met Ser 
2435 2440 2445 

Tyr Ser Trp Thr Gly Ala Leu Ile Thr Pro Cys Gly Pro Glu Glu 
2450 2455 2460 

Glu Lys Leu Pro Ile Asn Pro Leu Ser Asn Ser Leu Met Arg Phe 
2465 2470 2475 

His Asn Lys Val Tyr Ser Thr Thr Ser Arg Ser Ala Ser Leu Arg 
2480 2485 2490 

Ala Lys Lys Val Thr Phe Asp Arg Val Gln Val Leu Asp Ala His 
2495 2500 2505 

Tyr Asp Ser Val Leu Gln Asp Val Lys Arg Ala Ala Ser Lys Val 
2510 2515 2520 

Ser Ala Arg Leu Leu Thr Val Glu Glu Ala Cys Ala Leu Thr Pro 
2525 2530 2535 

Pro His Ser Ala Lys Ser Arg Tyr Gly Phe Gly Ala Lys Glu Val 
2540 2545 2550 

Arg Ser Leu Ser Arg Arg Ala Val Asn His Ile Arg Ser Val Trp 
2555 2560 2565 

Glu Asn Leu Leu Glu Asp Gln His Thr Pro Ile Asp Thr Thr Ile 
2570 2575 2580 

Met Ala Lys Asn Glu Val Phe Cys Ile Asp Pro Thr Lys Gly Gly 
2585 2590 2595 

Lys Lys Pro Ala Arg Leu Ile Val Tyr Pro Asp Leu Gly Val Arg 
2600 2605 2610 

Val Cys Glu Lys Met Ala Leu Tyr Asp Ile Ala Gln Lys Leu Pro 
2615 2620 2625 

Lys Ala Ile Met Gly Pro Ser Tyr Gly Phe Gln Tyr Ser Pro Ala 
2630 2635 2640 

Glu Arg Val Asp Phe Leu Leu Lys Ala Trp Gly Ser Lys Lys Asp 
2645 2650 2655 

Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val 
2660 2665 2670 

Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr Gln Ala Cys 
2675 2680 2685 

Ser Leu Pro Gln Glu Ala Arg Thr Val Ile His Ser Leu Thr Glu 
2690 2695 2700 

Arg Leu Tyr Val Gly Gly Pro Met Thr Asn Ser Lys Gly Gln Ser 
2705 2710 2715 

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser 
2720 2725 2730 

Met Gly Asn Thr Met Thr Cys Tyr Ile Lys Ala Leu Ala Ala Cys 
2735 2740 2745 

Lys Ala Ala Gly Ile Val Asp Pro Val Met Leu Val Cys Gly Asp 
2750 2755 2760 

Asp Leu Val Val Ile Ser Glu Ser Gln Gly Asn Glu Glu Asp Glu 
2765 2770 2775 

Arg Asn Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala 
2780 2785 2790 

Pro Pro Gly Asp Leu Pro Arg Pro Glu Tyr Asp Leu Glu Leu Ile 
2795 2800 2805 

Thr Ser Cys Ser Ser Asn Val Ser Val Ala Leu Asp Ser Arg Gly 
2810 2815 2820 

Arg Arg Arg Tyr Phe Leu Thr Arg Asp Pro Thr Thr Pro Ile Thr 
2825 2830 2835 

Arg Ala Ala Trp Glu Thr Val Arg His Ser Pro Val Asn Ser Trp 
2840 2845 2850 

Leu Gly Asn Ile Ile Gln Tyr Ala Pro Thr Ile Trp Val Arg Met 
2855 2860 2865 

Val Ile Met Thr His Phe Phe Ser Ile Leu Leu Ala Gln Asp Thr 
2870 2875 2880 

Leu Asn Gln Asn Leu Asn Phe Glu Met Tyr Gly Ala Val Tyr Ser 
2885 2890 2895 

Val Asn Pro Leu Asp Leu Pro Ala Ile Ile Glu Arg Leu His Gly 
2900 2905 2910 

Leu Glu Ala Phe Ser Leu His Thr Tyr Ser Pro His Glu Leu Ser 
2915 2920 2925 

Arg Val Ala Ala Thr Leu Arg Lys Leu Gly Ala Pro Pro Leu Arg 
2930 2935 2940 

Ala Trp Lys Ser Arg Ala Arg Ala Val Arg Ala Ser Leu Ile Ala 
2945 2950 2955 

Gln Gly Ala Arg Ala Ala Ile Cys Gly Arg Tyr Leu Phe Asn Trp 
2960 2965 2970 

Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Leu Pro Glu Ala Ser 
2975 2980 2985 

Arg Leu Asp Leu Ser Gly Trp Phe Thr Val Gly Ala Gly Gly Gly 
2990 2995 3000 

Asp Ile Tyr His Ser Val Ser His Ala Arg Pro Arg Leu Leu Leu 
3005 3010 3015 

Leu Cys Leu Leu Leu Leu Ser Val Gly Val Gly Ile Phe Leu Leu 
3020 3025 3030 

Pro Ala Arg 
3033 

Sequence ID No. 9 
Sequence Length: 3,033 
Sequence Type: amino acid 
Topology: linear 
Molecule Type: protein 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr

5 10 15

Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile

20 25 30

Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly

35 40 45

Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly

50 55 60

Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg Ser Thr Gly Lys Ser

65 70 75 

Trp Gly Lys Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly 

80 85 90 

Cys Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro 

95 100 105 

Thr Trp Gly Pro Thr Asp Pro Arg His Arg Ser Arg Asn Leu Gly

110 115 120

Arg Val Ile Asp Thr Ile Thr Cys Gly Phe Ala Asp Leu Met Gly

125 130 135

Tyr Ile Pro Val Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala

140 145 150

Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Ile Asn Tyr Ala

155 160 165
Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala

170 175 180

Leu Leu Ser Cys Val Thr Met Pro Val Ser Ala Val Glu Val Arg

185 190 195

Asn Ile Ser Ser Ser Tyr Tyr Ala Thr Asn Asp Cys Ser Asn Asn

200 205 210

Ser Ile Thr Trp Gln Leu Thr Asp Ala Val Leu His Leu Pro Gly

215 220 225

Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu Arg Cys Trp Ile

230 235 240

Gln Val Thr Pro Asp Val Ala Val Lys His Arg Gly Ala Leu Thr

245 250 255

Arg Ser Leu Arg Thr His Val Asp Met Ile Val Met Ala Ala Thr

260 265 270

Ala Cys Ser Ala Leu Tyr Val Gly Asp Val Cys Gly Ala Val Met

275 280 285

Ile Leu Ser Gln Ala Phe Met Val Ser Pro Gln Arg His Asn Phe

290 295 300

Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly His Ile Thr Gly

305 310 315

His Arg Met Ala Trp Asp Met Met Leu Asn Trp Ser Pro Thr Leu

320 325 330

Ala Met Ile Leu Ala Tyr Ala Ala Arg Val Pro Glu Leu Val Leu

335 340 345

Glu Ile Ile Phe Gly Gly His Trp Gly Val Ala Phe Gly Leu Gly

350 355 360

Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val Val Ala Ile Leu

365 370 375

Leu Leu Val Ala Gly Val Asp Ala Ser Thr Tyr Ser Thr Gly Gln



380 385 390 

Gln Ala Gly Arg Ala Ala Tyr Gly Ile Ser Ser Leu Phe Asn Thr

395 400 405

Gly Ala Lys Gln Asn Leu His Leu Ile Asn Thr Asn Gly Ser Trp

410 415 420

His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Glu Thr

425 430 435

Gly Phe Ile Ala Ser Leu Val Tyr Tyr Arg Arg Phe Asn Ser Ser

440 445 450 

Gly Cys Pro Glu Arg Leu Ser Ser Cys Arg Gly Leu Asp Asp Phe 

455 460 465 

Arg Ile Gly Trp Gly Thr Leu Glu Tyr Glu Thr Asn Val Thr Asn

470 475 480

Asp Glu Asp Met Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Pro

485 490 495

Cys Gly Ile Val Pro Ala Arg Thr Val Cys Gly Pro Val Tyr Cys

500 505 510

Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Lys Gln Gly

515 520 525 

Val Pro Thr Tyr Thr Trp Gly Glu Asn Glu Thr Asp Val Phe Leu 

530 535 540 

Leu Asn Ser Thr Arg Pro Pro Arg Gly Ala Trp Phe Gly Cys Thr 

545 550 555 

Trp Met Asn Gly Thr Gly Phe Thr Lys Thr Cys Gly Ala Pro Pro

560 565 570

Cys Arg Ile Arg Lys Asp Tyr Asn Ser Thr Ile Asp Leu Leu Cys

575 580 585

Pro Thr Asp Cys Phe Arg Lys His Pro Asp Ala Thr Tyr Leu Lys

590 595 600
Cys Gly Ala Gly Pro Trp Leu Thr Pro Arg Cys Leu Val Asp Tyr

605 610 615

Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn Phe Thr Ile

620 625 630 

Phe Lys Ala Arg Met Tyr Val Gly Gly Val Glu His Arg Phe Ser 

635 640 645 

Ala Ala Cys Asn Phe Thr Arg Gly Asp Arg Cys Arg Leu Glu Asp 

650 655 660 

Arg Asp Arg Gly Gln Gln Ser Pro Leu Leu His Ser Thr Thr Glu

665 670 675

Trp Ala Val Phe Pro Cys Ser Phe Ser Asp Leu Pro Ala Leu Ser

680 685 690

Thr Gly Leu Leu His Leu His Gln Asn Ile Val Asp Val Gln Tyr

695 700 705

Leu Tyr Gly Leu Ser Pro Ala Leu Thr Arg Tyr Ile Val Lys Trp

710 715 720

Glu Trp Val Ile Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val

725 730 735

Cys Ala Cys Leu Trp Met Leu Asn Ile Leu Gly Gln Ala Glu Ala

740 745 750

Ala Leu Glu Lys Leu Ile Ile Leu His Ser Ala Ser Ala Ala Ser

755 760 765

Ala Asn Gly Pro Leu Trp Phe Phe Ile Phe Phe Thr Ala Ala Trp

770 775 780 

Tyr Leu Lys Gly Arg Val Val Pro Val Ala Thr Tyr Ser Val Leu 

785 790 795 

Gly Leu Trp Ser Phe Leu Leu Leu Val Leu Ala Leu Pro Gln Gln

800 805 810

Ala Tyr Ala Leu Asp Ala Ala Glu Gln Gly Glu Leu Gly Leu Ala
815 820 825

Ile Leu Val Ile Ile Ser Ile Phe Thr Leu Thr Pro Ala Tyr Lys

830 835 840

Ile Leu Leu Ser Arg Ser Val Trp Trp Leu Ser Tyr Met Leu Val

845 850 855

Leu Ala Glu Ala Gln Ile Gln Gln Trp Val Pro Pro Leu Glu Val

860 865 870

Arg Gly Gly Arg Asp Gly Ile Ile Trp Val Ala Val Ile Leu His

875 880 885

Pro Arg Leu Val Phe Glu Val Thr Lys Trp Leu Leu Ala Ile Leu

890 895 900

Gly Pro Ala Tyr Leu Leu Arg Ala Ser Leu Leu Arg Ile Pro Tyr

905 910 915 

Phe Val Arg Ala His Ala Leu Leu Arg Val Cys Thr Leu Val Lys 

920 925 930 

His Leu Ala Gly Ala Arg Tyr Ile Gln Met Leu Leu Ile Thr Ile

935 940 945

Gly Arg Trp Thr Gly Thr Tyr Ile Tyr Asp His Leu Ser Pro Leu

950 955 960

Ser Thr Trp Ala Ala Gln Gly Leu Arg Asp Leu Ala Ile Ala Val

965 970 975

Glu Pro Val Val Phe Ser Pro Met Glu Lys Lys Val Ile Val Trp

980 985 990

Gly Ala Glu Thr Val Ala Cys Gly Asp Ile Leu His Gly Leu Pro

995 1000 1005 

Val Ser Ala Arg Leu Gly Arg Glu Val Leu Leu Gly Pro Ala Asp 

1010 1015 1020 

Gly Tyr Thr Ser Lys Gly Trp Asn Leu Leu Ala Pro Ile Thr Ala

1025 1030 1035
Tyr Thr Gln Gln Thr Arg Gly Leu Leu Gly Ala Ile Val Val Ser

1040 1045 1050

Leu Thr Gly Arg Asp Lys Asn Glu Gln Ala Gly Gln Val Gln Val

1055 1060 1065

Leu Ser Ser Val Thr Gln Thr Phe Leu Gly Thr Ser Ile Ser Gly

1070 1075 1080 

Val Leu Trp Thr Val Tyr His Gly Ala Gly Asn Lys Thr Leu Ala 

1085 1090 1095 

Gly Pro Lys Gly Pro Val Thr Gln Met Tyr Thr Ser Ala Glu Gly

1100 1105 1110 

Asp Leu Val Gly Trp Pro Ser Pro Pro Gly Thr Lys Ser Leu Asp 

1115 1120 1125 

Pro Cys Thr Cys Gly Ala Val Asp Leu Tyr Leu Val Thr Arg Asn 

1130 1135 1140 

Ala Asp Val Ile Pro Val Arg Arg Lys Asp Asp Arg Arg Gly Ala

1145 1150 1155 

Leu Leu Ser Pro Arg Pro Leu Ser Thr Leu Lys Gly Ser Ser Gly 

1160 1165 1170 

Gly Pro Val Leu Cys Ser Arg Gly His Ala Val Gly Leu Phe Arg 

1175 1180 1185 

Ala Ala Val Cys Ala Arg Gly Val Ala Lys Ser Ile Asp Phe Ile

1190 1195 1200

Pro Val Glu Ser Leu Asp Ile Ala Thr Arg Thr Pro Ser Phe Ser

1205 1210 1215

Asp Asn Ser Ala Pro Pro Ala Val Pro Gln Ser Tyr Gln Val Gly

1220 1225 1230 

Tyr Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro 

1235 1240 1245 

Ala Ala Tyr Ala Ser Gln Gly Tyr Lys Val Leu Val Leu Asn Pro
1250 1255 1260

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala
1265 1270 1275

His Gly Ile Asn Pro Asn Ile Arg Thr Gly Val Arg Thr Val Thr
1280 1285 1290

Thr Gly Asp Ser Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Ile Ala
1295 1300 1305

Asp Gly Gly Cys Ala Ala Gly Ala Tyr Asp Ile Ile Ile Cys Asp
1310 1315 1320

Glu Cys His Ser Val Asp Ala Thr Thr Ile Leu Gly Ile Gly Thr
1325 1330 1335

Val Leu Asp Gln Ala Glu Thr Ala Gly Val Arg Leu Val Val Leu
1340 1345 1350 

Ala Thr Ala Thr Pro Pro Gly Thr Val Thr Thr Pro His Ser Asn 
1355 1360 1365 

Ile Glu Glu Val Ala Leu Gly His Glu Gly Glu Ile Pro Phe Tyr
1370 1375 1380

Gly Lys Ala Ile Pro Leu Ala Phe Ile Lys Gly Gly Arg His Leu
1385 1390 1395

Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Ala
1400 1405 1410

Leu Arg Gly Thr Gly Val Asn Ala Val Ala Tyr Tyr Arg Gly Leu 
1415 1420 1425 

Asp Val Ser Val Ile Pro Thr Gln Gly Asp Val Val Val Val Ala
1430 1435 1440

Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val
1445 1450 1455

Ile Asp Cys Asn Val Ala Val Ser Gln Ile Val Asp Phe Ser Leu
1460 1465 1470
Asp Pro Thr Phe Thr Ile Thr Thr Gln Thr Val Pro Gln Asp Ala
1475 1480 1485

Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Leu
1490 1495 1500

Gly Ile Tyr Arg Tyr Val Ser Ser Gly Glu Gly Pro Ser Gly Met
1505 1510 1515 

Phe Asp Ser Val Val Pro Cys Glu Cys Tyr Asp Ala Gly Ala Ala 
1520 1525 1530 

Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala 
1535 1540 1545 

Tyr Phe Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu
1550 1555 1560

Phe Trp Glu Ala Val Phe Thr Gly Leu Thr His Ile Asn Ala His
1565 1570 1575

Phe Leu Ser Gln Thr Lys Gln Gly Gly Glu Asn Phe Ala Tyr Leu
1580 1585 1590

Thr Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys Ala Pro Pro
1595 1600 1605 

Pro Ser Trp Asp Val Met Trp Lys Cys Leu Thr Arg Leu Lys Pro 
1610 1615 1620 

Thr Leu Thr Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val 
1625 1630 1635

Thr Asn Glu Val Thr Leu Thr His Pro Val Thr Lys Tyr Ile Ala
1640 1645 1650

Thr Cys Met Gln Ala Asp Leu Glu Ile Met Thr Ser Ser Trp Val
1655 1660 1665 

Leu Ala Gly Gly Val Leu Ala Ala Val Ala Ala Tyr Cys Leu Ala 
1670 1675 1680 

Thr Gly Cys Ile Ser Ile Ile Gly Arg Leu His Leu Asn Asp Arg
1685 1690 1695

Val Val Val Thr Pro Asp Lys Glu Ile Leu Tyr Glu Ala Phe Asp
1700 1705 1710

Glu Met Glu Glu Cys Ala Ser Lys Ala Ala Leu Ile Glu Glu Gly
1715 1720 1725

Gln Arg Met Ala Glu Met Leu Lys Ser Lys Ile Gln Gly Leu Leu
1730 1735 1740

Gln Gln Ala Thr Arg Gln Ala Gln Gly Met Gln Pro Ala Ile Gln
1745 1750 1755

Ser Ser Trp Pro Lys Leu Glu Gln Phe Trp Ala Lys His Met Trp
1760 1765 1770

Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu
1775 1780 1785

Pro Gly Asn Pro Ala Val Ala Ser Met Met Ala Phe Ser Ala Ala
1790 1795 1800

Leu Thr Ser Pro Leu Pro Thr Ser Thr Thr Ile Leu Leu Asn Ile
1805 1810 1815

Met Gly Gly Trp Leu Ala Ser Gln Ile Ala Pro Pro Ala Gly Ala
1820 1825 1830 

Thr Gly Phe Val Val Ser Gly Leu Val Gly Ala Ala Val Gly Ser 
1835 1840 1845 

Ile Gly Leu Gly Lys Ile Leu Val Asp Val Leu Ala Gly Tyr Gly
1850 1855 1860

Ala Gly Ile Ser Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly
1865 1870 1875

Glu Lys Pro Thr Val Glu Asp Val Val Asn Leu Leu Pro Ala Ile
1880 1885 1890

Leu Ser Pro Gly Ala Leu Val Val Gly Val Ile Cys Ala Ala Ile
1895 1900 1905
Leu Arg Arg His Val Gly Gln Gly Glu Gly Ala Val Gln Trp Met

1910 1915 1920

Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ala Pro

1925 1930 1935

Thr His Tyr Val Val Glu Ser Asp Ala Ser Gln Arg Val Thr Gln

1940 1945 1950

Val Leu Ser Ser Leu Thr Ile Thr Ser Leu Leu Arg Arg Leu His

1955 1960 1965

Ala Trp Ile Thr Glu Asp Cys Pro Ile Pro Cys Ser Gly Ser Trp

1970 1975 1980

Leu Gln Asp Ile Trp Asp Trp Val Cys Ser Ile Leu Thr Asp Phe

1985 1990 1995

Lys Asn Trp Leu Ser Ser Lys Leu Leu Pro Lys Met Pro Gly Ile

2000 2005 2010

Pro Phe Ile Ser Cys Gln Lys Gly Tyr Lys Gly Val Trp Ala Gly

2015 2020 2025

Thr Gly Val Met Thr Thr Arg Tyr Pro Cys Gly Ala Asn Ile Ser

2030 2035 2040

Gly His Val Arg Met Gly Thr Met Lys Ile Thr Gly Pro Lys Thr

2045 2050 2055

Cys Leu Asn Leu Trp Gln Gly Thr Phe Pro Ile Asn Cys Tyr Thr

2060 2065 2070 

Glu Gly Pro Cys Val Pro Lys Pro Pro Pro Asn Tyr Lys Thr Ala 

2075 2080 2085 

Ile Trp Arg Val Ala Ala Ser Glu Tyr Val Glu Val Thr Gln His

2090 2095 2100 

Gly Ser Phe Ser Tyr Val Thr Gly Leu Thr Ser Asp Asn Leu Lys 

2105 2110 2115 

Val Pro Cys Gln Val Pro Ala Pro Glu Phe Phe Ser Trp Val Asp
2120 2125 2130

Gly Val Gln Ile His Arg Phe Ala Pro Val Pro Gly Pro Phe Phe

2135 2140 2145 

Arg Asp Glu Val Thr Phe Thr Val Gly Leu Asn Ser Phe Val Val 

2150 2155 2160 

Gly Ser Gln Leu Pro Cys Asp Pro Glu Pro Asp Thr Glu Val Leu

2165 2170 2175

Ala Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala

2180 2185 2190

Ala Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Gln Ala Ser Ser

2195 2200 2205

Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr

2210 2215 2220

Thr His Lys Thr Ala Tyr Asp Cys Asp Met Val Asp Ala Asn Leu

2225 2230 2235

Phe Met Gly Gly Asp Val Thr Arg Ile Glu Ser Asp Ser Lys Val

2240 2245 2250

Ile Val Leu Asp Ser Leu Asp Ser Met Thr Glu Val Glu Asp Asp

2255 2260 2265

Arg Glu Pro Ser Val Pro Ser Glu Tyr Leu Ile Lys Arg Arg Lys

2270 2275 2280 

Phe Pro Pro Ala Leu Pro Pro Trp Ala Arg Pro Asp Tyr Asn Pro 

2285 2290 2295 

Val Leu Ile Glu Thr Trp Lys Arg Pro Gly Tyr Glu Pro Pro Thr

2300 2305 2310

Val Leu Gly Cys Ala Leu Pro Pro Thr Leu Gln Thr Pro Val Pro

2315 2320 2325

Pro Pro Arg Arg Arg Arg Ala Lys Ile Leu Thr Gln Asp Asp Val

2330 2335 2340
Glu Gly Ile Leu Arg Glu Met Ala Asp Lys Val Leu Ser Pro Leu
2345 2350 2355

Gln Asp Asn Asn Asp Ser Gly His Ser Thr Gly Ala Asp Thr Gly
2360 2365 2370

Gly Asp Ile Val Gln Gln Pro Ser Asp Glu Thr Ala Ala Ser Glu
2375 2380 2385

Ala Gly Ser Leu Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly
2390 2395 2400 

Asp Pro Asp Leu Glu Phe Glu Pro Val Gly Ser Ala Pro Pro Ser 
2405 2410 2415 

Glu Gly Glu Cys Glu Val Ile Asp Ser Asp Ser Lys Ser Trp Ser
2420 2425 2430

Thr Val Ser Asp Gln Glu Asp Ser Val Ile Cys Cys Ser Met Ser
2435 2440 2445

Tyr Ser Trp Thr Gly Ala Leu Ile Thr Pro Cys Gly Pro Glu Glu
2450 2455 2460

Glu Lys Leu Pro Ile Asn Pro Leu Ser Asn Ser Leu Met Arg Phe
2465 2470 2475 

His Asn Lys Val Tyr Ser Thr Thr Ser Arg Ser Ala Ser Leu Arg 
2480 2485 2490 

Ala Lys Lys Val Thr Phe Asp Arg Val Gln Val Leu Asp Ala His
2495 2500 2505

Tyr Asp Ser Val Leu Gln Asp Val Lys Arg Ala Ala Ser Lys Val
2510 2515 2520 

Gly Ala Arg Leu Leu Thr Val Glu Glu Ala Cys Ala Leu Thr Pro 
2525 2530 2535 

Pro His Ser Ala Lys Ser Arg Tyr Gly Phe Gly Ala Lys Glu Val 
2540 2545 2550 

Arg Ser Leu Ser Arg Arg Ala Val Asn His Ile Arg Ser Val Trp
2555 2560 2565

Glu Asn Leu Leu Glu Asp Gln Arg Thr Pro Ile Asp Thr Thr Ile

2570 2575 2580

Met Ala Lys Asn Glu Val Phe Cys Ile Asp Pro Thr Lys Gly Gly

2585 2590 2595

Lys Lys Pro Ala Arg Leu Ile Val Tyr Pro Asp Leu Gly Val Arg

2600 2605 2610

Val Cys Glu Lys Met Ala Leu Tyr Asp Ile Thr Gln Lys Leu Pro

2615 2620 2625

Lys Ala Ile Met Gly Pro Ser Tyr Gly Phe Gln Tyr Ser Pro Ala

2630 2635 2640 

Glu Arg Val Asp Phe Leu Leu Lys Ala Trp Gly Ser Lys Lys Asp 

2645 2650 2655 

Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val 

2660 2665 2670 

Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr Gln Ala Cys

2675 2680 2685

Ser Leu Pro Gln Glu Ala Arg Thr Val Ile His Ser Leu Thr Glu

2690 2695 2700

Arg Leu Tyr Val Gly Gly Pro Met Thr Asn Ser Lys Gly Gln Ser

2705 2710 2715 

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser 

2720 2725 2730 
Met Gly Asn Thr Met Thr Cys Tyr Ile Lys Ala Leu Ala Ala Cys

2735 2740 2745

Lys Ala Ala Gly Ile Val Asp Pro Val Met Leu Val Cys Gly Asp

2750 2755 2760
Asp Leu Val Val Ile Ser Glu Ser Gln Gly Asn Glu Glu Asp Glu

2765 2770 2775
Arg Asn Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala

2780 2785 2790

Pro Pro Gly Asp Leu Pro Arg Pro Glu Tyr Asp Leu Glu Leu Ile

2795 2800 2805 

Thr Ser Cys Ser Ser Asn Val Ser Val Ala Leu Asp Ser Arg Gly 

2810 2815 2820 

Arg Arg Arg Tyr Phe Leu Thr Arg Asp Pro Thr Thr Pro Ile Thr

2825 2830 2835 

Arg Ala Ala Trp Glu Thr Val Arg His Ser Pro Val Asn Ser Trp 

2840 2845 2850 

Leu Gly Asn Ile Ile Gln Tyr Ala Pro Thr Ile Trp Val Arg Met

2855 2860 2865

Val Ile Met Thr His Phe Phe Ser Ile Leu Leu Ala Gln Asp Thr

2870 2875 2880

Leu Asn Gln Asn Leu Asn Phe Glu Met Tyr Gly Ala Val Tyr Ser

2885 2890 2895

Val Asn Pro Leu Asp Leu Pro Ala Ile Ile Glu Arg Leu His Gly

2900 2905 2910 

Leu Glu Ala Phe Ser Leu His Thr Tyr Ser Pro His Glu Leu Ser 

2915 2920 2925 

Arg Val Ala Ala Thr Leu Arg Lys Leu Gly Ala Pro Pro Leu Arg 

2930 2935 2940 

Ala Trp Lys Ser Arg Ala Arg Ala Val Arg Ala Ser Leu Ile Ala

2945 2950 2955

Gln Gly Ala Arg Ala Ala Ile Cys Gly Arg Tyr Leu Phe Asn Trp

2960 2965 2970 

Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Leu Pro Glu Ala Ser 

2975 2980 2985 

Arg Leu Asp Leu Ser Gly Trp Phe Thr Val Gly Ala Gly Gly Gly 